Special Relativity 


Chapter 1
Mathematical Part


1.1. Introduction 


As will become clear from the first chapters of the book, the theories of Physics
which study motion have a common base and structure; they are not independent
and unrelated considerations which at some limit simply produce the same
numerical results. The differentiation between the theories of motion is due either
to the mathematical quantities they use, or to the way they describe motion, or to both.

The Theory of Special Relativity was the first theory of Physics which introduced 
different mathematics from those of Newtonian Physics and a new way of describing 
motion. A result of this double differentiation was the creation of an obscurity 
concerning the “new” mathematics and the “strange” physical considerations, which 
often led to mistaken understandings of both. 

For this reason the approach in this book is somewhat different from the one 
usually followed in the literature. That is, we present first the necessary mathematics 
per se without any reference to the physical ideas. Then the physical principles 
of the physical theory of Special Relativity are stated and the theory is developed 
conceptually. Finally the interrelation of the two parts is done via the position vector 
and the description of “motion” in spacetime. In this manner the reader avoids 
the “paradoxes” and other misunderstandings resulting partially from the “new” 
mathematics and partially from remnants of Newtonian ideas in the new theory. 

Following the above approach, in the first chapter we present, in a concise man- 
ner, the main elements of the mathematical formalism required for the development 
and the discussion of the basic concepts of Special Relativity. The discussion gives 
emphasis to the geometric role of the new geometric objects and their relation to 
the mathematical consistency of the theory rather than to the formalism. Needless
to say, for a deeper understanding of the Theory of Special Relativity and the
subsequent transition to the Theory of General Relativity, or even to other more
specialized areas of relativistic Physics, it is necessary that the ideas discussed in
this chapter be enriched and studied in greater depth.




The discussion in this chapter is organized as follows. At first we recall certain 
elements from the theory of linear spaces and coordinate transformations. We 
define the concepts of dual space and dual basis. We consider the linear coordinate 
transformations and the group GL(n, R), whose action preserves the linear structure 
of the space. Subsequently we define the inner product and produce the general 
isometry equation (orthogonal transformations) in a real linear metric space. Up to 
this point the discussion is common for both Euclidian geometry and Lorentzian 
geometry (i.e. Special Relativity). The differentiation starts with the specification 
of the inner product. The Euclidian inner product defines the Euclidian metric, 
the Euclidian space, the Euclidian Cartesian coordinate systems, the Euclidian 
Orthogonal transformations and finally the Euclidian tensors. Similarly the Lorentz 
inner product defines the Lorentz metric, the Minkowski space or spacetime (of 
Special Relativity), the Lorentz Cartesian coordinate systems, the Lorentz transfor- 
mations and finally the Lorentz tensors. The parallel development of Newtonian 
Physics and the Theory of Special Relativity in their mathematical and physical 
structure is at the root of our approach and will be followed throughout the 
book. 


1.2 Elements From the Theory of Linear Spaces 


Although the basic notions of linear spaces are well known it would be useful to 
refer to some basic elements from an angle suitable to the physicist. In the following 
we consider a linear (real) space of dimension three, but the results apply to any 
(real) linear space of finite dimension. 

The elements of a linear space $V^3$ can be described in terms of the elements of
the linear space $R^3$ if we define in $V^3$ a basis. Indeed if $\{e_\mu\}$ ($\mu = 1, 2, 3$) is a basis
in $V^3$ then the vector $v \in V^3$ can be written:

$$v = \sum_{\mu=1}^{3} v^\mu e_\mu \tag{1.1}$$

where $v^\mu \in R$ ($\mu = 1, 2, 3$) are the components of $v$ in the basis $\{e_\mu\}$. If in the space
$V^3$ there are functions $\{x^\mu\}$ ($\mu = 1, 2, 3$) such that $\frac{\partial}{\partial x^\mu} = e_\mu$ ($\mu = 1, 2, 3$) then the
functions $\{x^\mu\}$ are called coordinate functions in $V^3$ and the basis $\{e_\mu\}$ is called
a holonomic basis. In $V^3$ there are holonomic and non-holonomic bases. In the
following by the term basis we shall always mean a holonomic basis. Furthermore
the Greek indices $\mu, \nu, \ldots$ take the values 1, 2, 3 except if specified differently.


1.2.1 Coordinate Transformations 


In a linear space there are infinitely many coordinate systems or, equivalently, bases.
If $\{e_\mu\}$, $\{e_{\mu'}\}$ are two bases, a vector $v \in V^3$ is decomposed as follows:

$$v = \sum_{\mu=1}^{3} v^\mu e_\mu = \sum_{\mu'=1}^{3} v^{\mu'} e_{\mu'} \tag{1.2}$$

where $v^\mu$, $v^{\mu'}$ are the components of $v$ in the bases $\{e_\mu\}$, $\{e_{\mu'}\}$ respectively. The
bases $\{e_\mu\}$, $\{e_{\mu'}\}$ are related by a linear coordinate transformation or change of
basis defined by the expression:

$$e_{\mu'} = \sum_{\mu=1}^{3} a^{\mu}_{\mu'} e_\mu, \qquad \mu' = 1, 2, 3 \tag{1.3}$$

if the quantities $a^{\mu}_{\mu'}$ are numbers.¹ The nine numbers $a^{\mu}_{\mu'}$ define a 3 × 3 matrix
$A = (a^{\mu}_{\mu'})$ whose determinant does not vanish. The non-singular matrix A is called
the transformation matrix from the basis $\{e_\mu\}$ to the basis $\{e_{\mu'}\}$. The matrix A is
not in general symmetric and we must adopt a convention as to which index of
$a^{\mu}_{\mu'}$ counts rows and which columns. In this book we make the following convention
concerning the indices of matrices:


Notation 1.2.1 (Convention of matrix indices) 


1. The basis vectors we shall write as the 1 × 3 matrix $(e_1, e_2, e_3) = [e_\mu]$. In
general the lower indices will count columns.

2. The components $v^1, v^2, v^3$ of a vector $v$ in the basis $[e_\mu]$ we shall write as the
3 × 1 matrix:

$$\begin{pmatrix} v^1 \\ v^2 \\ v^3 \end{pmatrix} = [v^\mu].$$

In general the upper indices will count rows.


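According to the above convention we write the 3 × 3 matrix $A = [a^{\mu}_{\mu'}]$ as
follows:

$$A = \begin{pmatrix} a^{1}_{1'} & a^{1}_{2'} & a^{1}_{3'} \\ a^{2}_{1'} & a^{2}_{2'} & a^{2}_{3'} \\ a^{3}_{1'} & a^{3}_{2'} & a^{3}_{3'} \end{pmatrix}. \tag{1.4}$$


¹ In general, elements of the field over which the vector space is defined.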




In this notation the vector $v$ is written as a product of matrices as follows:

$$v = [e_\mu][v^\mu].$$

We note that this form is simpler than the previous $v = \sum_{\mu=1}^{3} v^\mu e_\mu$ because it does
not have the $\sum$ symbol. However it is still elaborate because of the brackets of the
matrices. In order to save writing space and time and make the expressions simpler
and more functional we make the following further convention:


Notation 1.2.2 (Einstein's summation convention) When in a monomial (i.e. a
single term) an index is repeated as upper index and lower index, then it will be
understood that the index is summed over all its values and will be called a dummy
index. If we do not want summation over a particular repeated index, then we must
specify this explicitly.


Therefore instead of $\sum_{\mu=1}^{3} a^\mu b_\mu$ we shall simply write $a^\mu b_\mu$. According to the
convention of Einstein, relation (1.3) is written as:

$$e_{\mu'} = a^{\mu}_{\mu'} e_\mu. \tag{1.5}$$

Similarly for the vector $v$ we have $v = v^\mu e_\mu = v^{\mu'} e_{\mu'} = v^{\mu'} a^{\mu}_{\mu'} e_\mu$, which gives:

$$v^\mu = a^{\mu}_{\mu'} v^{\mu'}.$$

We note that the matrices of the coordinates transform differently from the matrices
of the bases (on the left-hand side we have $v^\mu$ and not $v^{\mu'}$). This leads us to
consider two types of vector quantities: the covariant with lower indices (which
transform as the matrices of bases) and the contravariant with upper indices (which
transform as the matrices of the components). Furthermore we name the upper
indices contravariant indices and the lower indices covariant indices.
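
The opposite behavior of basis vectors and components can be checked numerically. The following is a minimal sketch in plain Python, assuming a hypothetical transformation matrix A (not taken from the text) whose columns hold the new basis vectors expressed in the old basis, in line with the matrix convention above. The helper names `matmul` and `inverse` are illustrative only.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse(M):
    """Exact Gauss-Jordan inverse of a non-singular square matrix."""
    n = len(M)
    aug = [[Fraction(M[i][j]) for j in range(n)] +
           [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Hypothetical change of basis: e_1' = e_1, e_2' = e_1 + e_2, e_3' = 2 e_3.
# Column mu' of A holds the components of e_mu' in the old basis.
A = [[1, 1, 0],
     [0, 1, 0],
     [0, 0, 2]]

v_old = [[3], [2], [4]]              # [v^mu], components in the old basis
v_new = matmul(inverse(A), v_old)    # components transform with the inverse matrix
v_back = matmul(A, v_new)            # [v^mu] = A [v^mu'] recovers the old components

print([int(row[0]) for row in v_new])
```

The bases transform with A while the components transform with the inverse of A, so that the vector itself is unchanged.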

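The sole difference between covariant and contravariant indices is their behavior
under successive coordinate transformations. Indeed let $(a^{\mu}_{\mu'})$, $(a^{\mu'}_{\mu''})$ be two
successive changes of basis. The composite transformation $(a^{\mu}_{\mu''})$ is defined by the
product of matrices:

$$(a^{\mu}_{\mu''}) = (a^{\mu}_{\mu'})(a^{\mu'}_{\mu''})$$

or simply:

$$a^{\mu}_{\mu''} = a^{\mu}_{\mu'}\, a^{\mu'}_{\mu''}. \tag{1.6}$$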




Concerning the bases we have, in an obvious notation:

$$[e'] = [e]A, \quad [e''] = [e']A' \;\Rightarrow\; [e''] = [e]AA' \tag{1.7}$$

whereas for the coordinates:

$$[v] = A[v'], \quad [v'] = A'[v''] \;\Rightarrow\; [v''] = A'^{-1}A^{-1}[v] = (AA')^{-1}[v]. \tag{1.8}$$

Relations (1.7) and (1.8) show the difference in the behavior of the two types of
indices under composition of coordinate transformations.
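
As a quick numerical check of how components compose, the sketch below uses two hypothetical 2 × 2 transformation matrices (chosen only for illustration) and verifies that the components pick up the inverses in reverse order. The helpers `matmul` and `inv2` are illustrative names, not notation from the text.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inv2(M):
    """Exact inverse of a non-singular 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1, 2], [0, 1]]        # first change of basis:  [e'] = [e] A
A_prime = [[2, 0], [1, 1]]  # second change of basis: [e''] = [e'] A'

composite = matmul(A, A_prime)                    # bases: [e''] = [e] A A'
components = matmul(inv2(A_prime), inv2(A))       # coordinates: [v''] = A'^{-1} A^{-1} [v]
wrong_order = matmul(inv2(A), inv2(A_prime))      # inverses taken in the original order

print(components == inv2(composite))   # reverse-order product equals (A A')^{-1}
print(wrong_order == inv2(composite))  # the original order does not, in general
```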
Let $V^3$ be a linear space and $V^{3*}$ the set of all linear maps $U$ of $V^3$ into $R$:

$$U(au + bv) = aU(u) + bU(v) \qquad \forall a, b \in R,\; u, v \in V^3.$$

The set $V^{3*}$ becomes a linear space if we define the R-linear operation:

$$(aU + bV)(u) = aU(u) + bV(u) \qquad \forall a, b \in R,\; u \in V^3,\; U, V \in V^{3*}.$$

This new linear space $V^{3*}$ we call the dual space of $V^3$. The dimension of $V^{3*}$
equals the dimension of $V^3$. To every basis $[e_\mu]$ of $V^3$ there corresponds a unique
basis of $V^{3*}$, which we call the dual basis of $[e_\mu]$, write² as $[e^\mu]$ and define as
follows:

$$e^\mu(e_\nu) = \delta^\mu_\nu$$

where $\delta^\mu_\nu$ is the Kronecker delta, for which all diagonal entries are 1 and all others
0. It is easy to show that to every coordinate transformation $A = [a^{\mu}_{\mu'}]$ of $V^3$ there
corresponds a unique coordinate transformation of $V^{3*}$, which is represented by the
inverse matrix $A^{-1}$. This matrix we agree to write as $a^{\mu'}_{\mu}$. Then the corresponding
transformation of the dual basis is written:

$$e^{\mu'} = a^{\mu'}_{\mu} e^\mu.$$

As a result of this last convention we have the following "orthogonality relation" for
the matrix of a coordinate transformation A:

$$a^{\mu}_{\mu'}\, a^{\mu'}_{\nu} = \delta^\mu_\nu.$$
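
The construction of the dual basis can be illustrated with a small sketch. It assumes, for illustration only, a two-dimensional space whose basis vectors are stored as the columns of a matrix E; the dual basis covectors are then the rows of the inverse of E. The matrices E and A below are hypothetical examples.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inv2(M):
    """Exact inverse of a non-singular 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

# Columns of E: e_1 = (1, 0), e_2 = (1, 2) in some reference coordinates.
E = [[1, 1],
     [0, 2]]
dual = inv2(E)                 # row mu of `dual` is the covector e^mu
pairing = matmul(dual, E)      # entry (mu, nu) is e^mu(e_nu)

# Change of basis [e'] = [e] A: the dual basis transforms with the inverse of A.
A = [[1, 2],
     [0, 1]]
dual_new_direct = inv2(matmul(E, A))        # dual of the new basis, computed directly
dual_new_rule = matmul(inv2(A), dual)       # e^mu' = (A^{-1})^mu'_mu e^mu

print(pairing)
```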


² The notation of the dual basis with the same symbol as the corresponding basis of $V^3$ but with
upper index, is justified by the fact that the dual space of the dual space is the initial space, that is,
$(V^{3*})^* = V^3$. Therefore we need only two positions in order to differentiate the bases and this is
achieved with the change of the position of the corresponding indices.




As we remarked above, in a linear space there are linear and non-linear coordinate
transformations. The linear transformations $f : V^3 \to V^3$ are defined as follows:

$$f(au + bv) = af(u) + bf(v) \qquad \forall a, b \in R,\; u, v \in V^3.$$


They preserve the linear structure of V* and they are described by matrices with 
constant coefficients (i.e. real numbers). Geometrically they preserve the straight 
lines and the planes. The non-linear transformations do not preserve the linear 
structures such as the straight lines and the planes. 

As a rule in the following we shall consider the linear transformations only.³ The
set of all linear transformations of $V^3$ we shall write as $L(V^3)$.

In a basis $[e_\mu]$ of $V^3$ a linear map $f \in L(V^3)$ is represented by a 3 × 3
matrix $(f^{\mu}_{\nu})$, which we call the representation of $f$ in the basis $[e_\mu]$. There are
two ways to relate the matrix $(f^{\mu}_{\nu})$ with a transformation: either to consider that
$(f^{\mu}_{\nu})$ defines a transformation of coordinates $[v^\mu] \to [v^{\mu'}]$ or to assume that
it defines a transformation of bases $[e_\mu] \to [e_{\mu'}]$. In the first case we say that
we have an active interpretation of the transformation and in the second case
a passive interpretation of the transformation. For finite dimensional spaces the
two interpretations are equivalent. In the following we shall follow the trend in the
literature and select the active interpretation.

The set $L(V^3)$ of all linear maps of a linear space $V^3$ into itself becomes a linear
space of dimension $3^2 = 9$, if we define the operation:

$$(\lambda f + \mu g)(v) = \lambda f(v) + \mu g(v) \qquad \forall f, g \in L(V^3),\; \lambda, \mu \in R,\; v \in V^3.$$

Furthermore the invertible (non-singular) elements of $L(V^3)$ have the structure of a
group with operation the composition of transformations (defined by the multiplication
of the representative matrices).⁴ This group we denote by GL(3, R) and call the
General Linear Group in 3 dimensions.


³ However we shall consider non-linear transformations when we study the 4-acceleration (see
Sect. 7.11).

⁴ A group G is a set of elements (a, b, ...), in which we have defined a binary operation "∘" which
satisfies the following relations:

(a) $\forall b, c \in G$ the element $b \circ c \in G$.

(b) There exists a unique element $e \in G$ such that $\forall a \in G$: $e \circ a = a \circ e = a$. We call $e$ the
unit element of G.

(c) $\forall b \in G$ there exists a unique element $c \in G$ such that $b \circ c = c \circ b = e$. The element $c$ we
call the inverse of $b$ and write it as $b^{-1}$.

(d) The operation is associative, that is, $\forall a, b, c \in G$: $(a \circ b) \circ c = a \circ (b \circ c)$.


1.3 Inner Product: Metric 


The general linear transformations are not useful in the study of linear spaces in
practice, because they describe nothing more than the space itself. For this reason
we use the space as the substratum onto which we define various geometric (and 
non-geometric) mathematical structures, which can be used in various applications. 
These new structures inherit the linear structure of the background space in the 
sense (to be clarified further down) that they transform in a definite manner under the 
action of certain subgroups of transformations of the general linear group GL(3, R). 

The most fundamental new structure on a linear space is the inner product and 
is defined as follows. 


Definition 1.3.1 A map $\rho : V^3 \times V^3 \to R$ is an inner product⁵ on the linear space
$V^3$ if it satisfies the following properties:

α. $\rho(u, v) = \rho(v, u)$
β. $\rho(\mu u + \nu v, w) = \mu\rho(u, w) + \nu\rho(v, w)$, $\quad \forall u, v, w \in V^3,\; \mu, \nu \in R$.

Obviously $\rho(\,,\,)$ is symmetric⁶ and R-linear. A linear space endowed with
an inner product will be called a metric space or a space with a metric. The inner
product we shall indicate in general with a dot, $\rho(u, v) = u \cdot v$. The length of the
vector $u$ with reference to the inner product is defined by the relation $u^2 =
u \cdot u = \rho(u, u)$.

In a linear space it is possible to define many inner products. The inner product
for which $u^2 \geq 0$ $\forall u \in V^3$ and $u^2 = 0 \Rightarrow u = 0$ we call the Euclidian inner
product and denote by $\cdot_E$ or by a simple dot if it is explicitly understood. The
Euclidian inner product is unique in the property $u^2 > 0$ $\forall u \in V^3 - \{0\}$. In all other
inner products the length of a vector can be positive, negative or zero.

To every inner product in $V^3$ we can associate in each basis $[e_\mu]$ of $V^3$ the 3 × 3
symmetric matrix:

$$g_{\mu\nu} = e_\mu \cdot e_\nu. \tag{1.9}$$

The matrix $g_{\mu\nu}$ we call the representation of the inner product in the basis
$[e_\mu]$. The inner product will be called non-degenerate if $\det[g_{\mu\nu}] \neq 0$. All inner
products we shall consider in this book are non-degenerate.

A basis $[e_\mu]$ of $V^3$ will be called g-Cartesian or g-orthonormal if the
representation of the inner product in this basis is $g_{\mu\nu} = \mathrm{diag}(\pm 1, \pm 1, \ldots)$.
Obviously there are infinitely many g-orthonormal bases for every inner product. We have
the following result.


⁵ Note that we do not require the inner product to be positive definite, because this is not true for
the Lorentzian inner product which can be > 0, < 0 or = 0.

⁶ The inner product in general is not necessarily symmetric. In this book all inner products are
symmetric.




Proposition 1.3.1 For every (non-degenerate) inner product there exist g-orthonormal
bases (Gram-Schmidt Theorem) and furthermore the number of
+1 and −1 is the same for every g-orthonormal basis and it is characteristic of
the inner product (Theorem of Sylvester). If r is the number of −1 and s the number
of +1 the number r − s we call the character of the inner product.


As is well known from Linear Algebra, a non-degenerate symmetric matrix can
always be brought to diagonal form with elements ±1 by means of a congruent
transformation.⁷ This form of the matrix is called the canonical form. The
congruent transformation which brings a non-degenerate symmetric matrix to its
canonical form is not unique. In fact for each non-degenerate symmetric matrix there
is a group of congruent transformations which brings the matrix into its canonical
form. Under the action of this group the reduced form of the matrix remains the
same.

Let $g_{\mu\nu}$ be the representation of the inner product in a general basis and $G_{\rho\sigma}$ the
representation in a g-orthonormal basis. If $f^{\mu}_{\rho}$ is the transformation which relates
the two bases then it is easy to show the relation $G_{\rho\sigma} = f^{\mu}_{\rho}\, g_{\mu\nu}\, f^{\nu}_{\sigma}$. This
relation implies that the transformation $f^{\mu}_{\rho}$ is a congruent transformation, therefore
it always exists and can be found with well known methods.
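
One of the well known methods is symmetric Gaussian elimination, in which every row operation is followed by the same column operation so that each step is a congruent transformation. The sketch below is an illustrative implementation under that assumption, not the author's procedure. It returns only the numbers r and s of Proposition 1.3.1; rescaling the diagonal entries to ±1 would be the last step. The function name `signature` and the sample matrix are hypothetical.

```python
from fractions import Fraction

def signature(g):
    """Return (r, s): the numbers of -1 and +1 in the canonical form of a
    non-degenerate real symmetric matrix g, found by congruence operations."""
    n = len(g)
    m = [[Fraction(x) for x in row] for row in g]

    def add_row_col(i, j, c):
        # Congruence step: row_i += c * row_j, then col_i += c * col_j.
        for t in range(n):
            m[i][t] += c * m[j][t]
        for t in range(n):
            m[t][i] += c * m[t][j]

    for k in range(n):
        if m[k][k] == 0:
            # Create a non-zero pivot from an off-diagonal entry of row k.
            j = next((j for j in range(k + 1, n) if m[k][j] != 0), None)
            if j is None:
                raise ValueError("degenerate inner product")
            c = 1 if 2 * m[k][j] + m[j][j] != 0 else -1
            add_row_col(k, j, c)
        for i in range(k + 1, n):
            add_row_col(i, k, -m[i][k] / m[k][k])   # clear entries (i, k) and (k, i)

    r = sum(1 for k in range(n) if m[k][k] < 0)
    s = sum(1 for k in range(n) if m[k][k] > 0)
    return r, s

# Hypothetical example: a symmetric matrix that is not diagonal.
g_example = [[0, 1, 0, 0],
             [1, 0, 0, 0],
             [0, 0, 1, 0],
             [0, 0, 0, 1]]
r, s = signature(g_example)
print(r, s, r - s)   # numbers of -1 and +1, and the character
```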

We conclude that in a four dimensional space we have at most three distinct
classes of inner products.⁸

• The Euclidian inner product with character −4 and canonical form $g_{ij} =
\mathrm{diag}(1, 1, 1, 1)$

• The Lorentz inner product with character −2 and canonical form $g_{ij} =
\mathrm{diag}(-1, 1, 1, 1)$

• The inner product (without a specific name) with character 0 and canonical form
$g_{ij} = \mathrm{diag}(-1, -1, 1, 1)$.
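
The count of three classes can be reproduced by enumeration. The sketch below lists the pairs (r, s) with r + s = n and identifies an inner product with its overall negative, which swaps r and s. The function name `classes` is illustrative.

```python
def classes(n):
    """Distinct signatures (r, s) with r + s = n, identifying g with -g."""
    seen = []
    for r in range(n + 1):
        s = n - r
        representative = (min(r, s), max(r, s))   # g and -g swap r and s
        if representative not in seen:
            seen.append(representative)
    return seen

four_dim = classes(4)
characters = [r - s for r, s in four_dim]
print(four_dim, characters)
```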


A linear space (of any finite dimension n) endowed with the Euclidian inner
product we call Euclidian space and denote $E^n$. A linear space of dimension four
endowed with the Lorentz inner product we call spacetime or Minkowski space
and write as $M^4$. Newtonian Physics uses the Euclidian space $E^3$ and the Theory
of Special Relativity the Minkowski space $M^4$.


⁷ Two n × n matrices A and B over the same field K are called congruent if there exists an
invertible n × n matrix P over K such that $P^t A P = B$. A congruent transformation is a mapping
whose transformation matrix is P. Congruent matrices have the same rank and, for real symmetric
matrices, the same numbers of positive and negative eigenvalues (Sylvester's law of inertia). In
general they do not have the same determinant, trace or eigenvalues; those are shared by similar
matrices ($P^{-1} A P = B$). Congruent matrices can be thought of as describing the same
bilinear form in different bases. Because of this, for a given matrix A, one is interested in finding
a simple "normal form" B which is congruent to A and reduces the study of the matrix A. One such
form is the canonical form of a non-degenerate symmetric matrix A mentioned above.

⁸ The inner product $g_{ij} = \mathrm{diag}(1, -1, -1, -1)$ is the same as $g_{ij} = \mathrm{diag}(-1, 1, 1, 1)$ after
multiplying the entire matrix by −1.




Every inner product in the space $V^3$ induces a unique inner product in the dual
space $V^{3*}$ as follows. Because the inner product is a bilinear function it is enough
to define its action on the basis vectors. Let $[e_\mu]$ be a basis of $V^3$ and $[e^\mu]$ its
dual in $V^{3*}$. We define the matrix $g^{\mu\nu}$ of the induced inner product in $V^{3*}$ by the
requirement:

$$g^{\mu\nu} = e^\mu \cdot e^\nu = [g_{\mu\nu}]^{-1}$$

or equivalently:

$$g_{\mu\nu}\, g^{\nu\rho} = \delta^\rho_\mu. \tag{1.10}$$

For the definition to be acceptable it must be independent of the particular basis
$[e_\mu]$. To achieve that we demand that in any other coordinate system $[e_{\mu'}]$ holds⁹:

$$g_{\mu'\nu'}\, g^{\nu'\rho'} = \delta^{\rho'}_{\mu'}. \tag{1.11}$$


In order to find the transformation of the matrices $g_{\mu\nu}$, $g^{\mu\nu}$ under coordinate
transformations we consider two bases $[e_\mu]$ and $[e_{\mu'}]$, which are related with the
transformation $f^{\mu}_{\mu'}$:

$$e_{\mu'} = f^{\mu}_{\mu'} e_\mu.$$

Using relation (1.9) and the linearity of the inner product we have:

$$g_{\mu'\nu'} = e_{\mu'} \cdot e_{\nu'} = f^{\mu}_{\mu'} f^{\nu}_{\nu'}\, e_\mu \cdot e_\nu \;\Rightarrow\; g_{\mu'\nu'} = f^{\mu}_{\mu'} f^{\nu}_{\nu'}\, g_{\mu\nu}. \tag{1.12}$$

We conclude that under coordinate transformations the matrix $g_{\mu\nu}$ is transformed
homogeneously (there is no constant term) and linearly (there is only one $g_{\mu\nu}$ in the
rhs). This type of transformation we call tensorial and shall be used extensively in
the following.
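
The transformation law (1.12) can be written as an explicit double sum over the repeated indices. The sketch below applies it to the Euclidian metric in two dimensions with two hypothetical changes of basis: a shear, which changes the representation of the metric, and a rotation by 90 degrees, which leaves it unchanged. The function name `transform_metric` is illustrative.

```python
def transform_metric(g, F):
    """g'_{mu' nu'} = F^mu_mu' F^nu_nu' g_{mu nu}, with F[mu][mu'] = F^mu_mu'
    (column mu' of F holds the components of e_mu' in the old basis)."""
    n = len(g)
    return [[sum(F[mu][mp] * F[nu][nq] * g[mu][nu]
                 for mu in range(n) for nu in range(n))
             for nq in range(n)] for mp in range(n)]

g_cartesian = [[1, 0],
               [0, 1]]

shear = [[1, 1],       # e_1' = e_1, e_2' = e_1 + e_2
         [0, 1]]
rotation = [[0, -1],   # e_1' = e_2, e_2' = -e_1
            [1, 0]]

g_sheared = transform_metric(g_cartesian, shear)
g_rotated = transform_metric(g_cartesian, rotation)
print(g_sheared)   # entries are the inner products e_mu' . e_nu'
print(g_rotated)
```

The sheared basis is no longer g-orthonormal, whereas the rotated one is, which anticipates the notion of isometry introduced below.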


Exercise 1.3.1 Show that the induced matrix $g^{\mu\nu}$ in the dual space transforms
tensorially under coordinate transformations in that space.


Because the transformation matrices $f^{\mu}_{\mu'}$, $f^{\mu'}_{\mu}$ are inverse to one another they
satisfy the relation:

$$f^{\mu}_{\mu'}\, f^{\mu'}_{\nu} = \delta^\mu_\nu \tag{1.13}$$


⁹ The quantity $\delta^{\mu'}_{\rho'}$ has the same components in all coordinate systems, that is, 1 if $\mu' = \rho'$ and 0
when $\mu' \neq \rho'$. It is called the Kronecker delta.




which can be written as:

$$\delta^\mu_\nu = f^{\mu}_{\mu'}\, f^{\nu'}_{\nu}\, \delta^{\mu'}_{\nu'}. \tag{1.14}$$

The last equation indicates that the 3 × 3 matrices $\delta^\mu_\nu$ under coordinate
transformations transform tensorially. The representation of the inner product in the dual
spaces $V^3$ and $V^{3*}$ with the inverse matrices $g_{\mu\nu}$, $g^{\mu\nu}$ respectively, the relation of
the dual bases with the matrix $\delta^\mu_\nu$ and the fact that the matrices $g_{\mu\nu}$, $g^{\mu\nu}$ and
$\delta^\mu_\nu$ transform tensorially under coordinate transformations lead us to the conclusion
that the three matrices $g_{\mu\nu}$, $g^{\mu\nu}$, $\delta^\mu_\nu$ are the representations of one and the same
geometric object. This new geometric object we call the metric of the inner product.
Thus the Euclidian inner product leads to the Euclidian metric $g_E$ and
the Lorentz inner product to the Lorentz metric $g_L$.

The main role of the metric in a linear space is the selection of the g-Cartesian
or g-orthonormal bases and coordinate systems, in which the metric has its
canonical form $g_{\mu\nu} = \mathrm{diag}(\pm 1, \pm 1, \ldots)$. For the Euclidian metric $g_E$ we have
the Euclidian Cartesian frames and for the Lorentz metric $g_L$ we have the Lorentz
Cartesian frames. In the following in order to save writing we shall refer to the first
as ECF and the second as LCF.

As we shall see later the ECF correspond to the Newtonian Inertial frames and 
the LCF to the relativistic inertial frames of Special Relativity. 

Let $K(g, e)$ be the set of all g-Cartesian bases of a metric $g$. This set can be
generated from any of its elements by the action of a proper coordinate transformation.
These coordinate transformations we call g-isometries or g-orthogonal
transformations. The g-isometries are all¹⁰ transformations of the linear space
which leave the canonical form of the metric the same. If $[f] = (f^{\mu'}_{\mu})$ is a
g-isometry between the bases $e_\mu, e_{\mu'} \in K(g, e)$ and $[g]$ is the canonical form
of $g$, we have the relation (congruent transformation):

$$[f^{-1}]^t\, [g]\, [f^{-1}] = [g'] \tag{1.15}$$

with $[g'] = [g]$, since both bases are g-Cartesian.

Equation (1.15) is important because it is a matrix equation whose solution
gives the totality (of the group) of (linear) isometries of the metric $g$. For
this reason we call equation (1.15) the fundamental equation of isometry.¹¹
The set of all (linear) isometries of a metric of a linear metric space $V^3$ is a
closed subgroup of the general linear group GL(3, R). From equation (1.15) it
follows that the determinant of a g-orthogonal transformation equals ±1. Due
to that we distinguish the g-orthogonal transformations in two large subsets. The


¹⁰ In the following in general we shall restrict our considerations to linear transformations. However
non-linear transformations are possible. For example we shall consider non-linear transformations
in Sect. 7.11 where we associate accelerated motions with the conformal transformations.

¹¹ This is essentially Killing's equation in a linear space.




proper g-orthogonal transformations with determinant +1 and the improper
g-orthogonal transformations with determinant −1. The proper g-orthogonal
transformations form a group whereas the improper do not (why?). Every metric
has its own group of proper orthogonal transformations. For the Euclidian metric
this group is the Euclidian group E(3) and for the Lorentz metric the Poincaré
group.
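
The isometry condition can be tested directly on candidate matrices. The sketch below works in a two-dimensional spacetime (t, x) with the Lorentz metric diag(−1, 1) and checks the condition in the equivalent form $L^t g L = g$. It assumes the standard form of a boost along x in units with c = 1, which is not derived in this section; the value β = 3/5 is chosen only because it gives the exact factor γ = 5/4.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def is_isometry(L, g):
    """True if L^t g L = g, i.e. L leaves the canonical form of g unchanged."""
    return matmul(matmul(transpose(L), g), L) == g

eta = [[-1, 0],
       [0, 1]]

beta = Fraction(3, 5)
gamma = Fraction(5, 4)                      # 1 / sqrt(1 - beta^2) for beta = 3/5
boost = [[gamma, -gamma * beta],
         [-gamma * beta, gamma]]
reflection = [[1, 0],
              [0, -1]]                      # x -> -x
shear = [[1, 1],
         [0, 1]]                            # a generic invertible matrix

for name, L in [("boost", boost), ("reflection", reflection), ("shear", shear)]:
    print(name, is_isometry(L, eta), det2(L))
```

The boost is a proper isometry (determinant +1), the reflection an improper one (determinant −1), and the shear is not an isometry although its determinant is +1.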

The group of g-orthogonal transformations in a linear space of dimension n has
dimension n(n + 1)/2. This means that any element of the group can be described
in terms of n(n + 1)/2 parameters, called group parameters.¹² Of these, n (= dim V)
refer to "translations" along the coordinate lines and the remaining n(n + 1)/2 − n = n(n − 1)/2
to "rotations" in the corresponding planes of the coordinates. How these parameters
are used to describe motion in a given theory is postulated by the kinematics of the
theory. The Euclidian group has dimension 6 (3 translations and 3 rotations) and the
Poincaré group dimension 10 (4 translations and 6 rotations). The rotations of the
Poincaré group form a closed subgroup called the Lorentz group.
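
The parameter count is simple arithmetic and can be tabulated as follows. The function name `group_parameters` is illustrative.

```python
def group_parameters(n):
    """Split the n(n+1)/2 group parameters into translations and rotations."""
    translations = n                   # one per coordinate line
    rotations = n * (n - 1) // 2       # one per coordinate plane
    return translations, rotations, translations + rotations

euclidian = group_parameters(3)   # Euclidian group in three dimensions
poincare = group_parameters(4)    # Poincaré group in four dimensions
print(euclidian, poincare)
```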

We conclude that the role of a metric is manifold. From the set of all bases of the
space it selects the g-Cartesian bases and from the set of all linear (non-degenerate)
automorphisms of the space the g-orthogonal transformations. Furthermore the
metric specifies the g-tensorial behavior which will be used in the next section
to define geometric objects more general than the metric and compatible with the
linear and the metric structure of the space.


1.4 Tensors 


The vectors and the metric of a linear space are geometric objects, with one and
two indices respectively which transform tensorially.¹³ The question arises if we
need to consider geometric objects with more indices, which under the action of 
GL(n, R) or of a given subgroup G of GL(n, R), transform tensorially. The study 
of Geometry and Physics showed that this is imperative. For example the curvature 
tensor is a geometric object with four indices. These new geometric objects we call 
with a collective name tensors and are the basic tools of both Geometry and Physics. 
Of course in mathematical Physics one needs to use geometric objects which are not 
tensors, but this will not concern us in this book. 

We start our discussion of tensors without specifying either the dimension of the 
space or the subgroup G of GL(n, R), therefore the results apply to both Euclidian 


¹² We recall that translation is a transformation of the form $u \to u' = u + a$ $\forall u, u' \in V^3$ and
rotation is a transformation of the form $u \to u' = Au$ where the matrix A satisfies the fundamental
equation of isometry (1.15) and $u, u' \in V^3$. The translations follow in an obvious way and we need
only to compute the rotations.

¹³ A geometric object is a quantity which has a specific transformation law (not necessarily linear
and homogeneous i.e. tensorial) under coordinate transformations.




tensors and Lorentz tensors. Let G be a group of linear transformations of a linear
space $V^n$ and let $F(V^n)$ be the set of all bases of $V^n$. We consider an arbitrary basis
in $F(V^n)$ and construct all the bases which are obtained from that basis under the
action of the group G. This action selects in general a subset of bases in the set
$F(V^n)$. We say that the bases in this set are G-related and we call them G-bases.
From the subset of remaining bases we select a new basis, and repeat the procedure
and so on. Eventually we end up with a set of sets of bases so that the bases in
each set are G-related and bases in different sets are not related with an element
of G.¹⁴ We conclude that the existence of a group G of coordinate transformations
in $V^n$ makes possible the division of the bases and the coordinate systems of $V^n$ in
classes of G-equivalent elements. The choice of G-bases in a linear space makes
possible the definition of G-equivalent geometric objects in that space, by means
of the following definition.


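Definition 1.4.1 Let G be a group of linear coordinate transformations of a linear
space $V^n$ and suppose that the element of G which relates the G-bases $[e] \to [e']$
is represented with the n × n matrix $A^{i'}_{j}$. We define a G-tensor of order (r, s) and
write as $T^{i_1 \ldots i_r}_{j_1 \ldots j_s}$ a geometric object which:

1. Has $n^{r+s}$ components of which $n^r$ are G-contravariant (upper indices $i_1 \ldots i_r$)
and $n^s$ are G-covariant (lower indices $j_1 \ldots j_s$).

2. The $n^{r+s}$ components under the action of the element $A^{i'}_{j}$ of the group G
transform tensorially, that is:

$$T^{i'_1 \ldots i'_r}_{j'_1 \ldots j'_s} = A^{i'_1}_{i_1} \cdots A^{i'_r}_{i_r}\, A^{j_1}_{j'_1} \cdots A^{j_s}_{j'_s}\, T^{i_1 \ldots i_r}_{j_1 \ldots j_s} \tag{1.16}$$

where $T^{i_1 \ldots i_r}_{j_1 \ldots j_s}$ are the components of the tensor in the basis [e], $T^{i'_1 \ldots i'_r}_{j'_1 \ldots j'_s}$ the
components in the G-related basis [e'], and $A^{j}_{j'}$ denotes the inverse of the matrix $A^{i'}_{j}$.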


where c 

From the definition it follows that if we know a tensor in one G-basis then we
can compute it in any other G-basis using (1.16) and the matrix $A^{i'}_{j}$ representing
the element of G relating the two bases. Therefore we can divide the set of all
tensors, $T(V^n)$ say, on the linear space $V^n$, in sets of G-tensors in the same manner
we divided the set of bases and the set of coordinate systems in G-bases and
G-coordinate systems respectively. In conclusion:


For every subgroup G of GL(n, R) we have G-bases, G-coordinate systems and
G-tensors.
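
The tensorial transformation law can be spelled out for the simplest mixed case, a tensor of order (1, 1). The sketch below writes the law as an explicit double sum with one factor of the transformation matrix for the upper index and one factor of its inverse for the lower index. The matrices A and T are hypothetical examples and the function name `transform_11` is illustrative.

```python
from fractions import Fraction

def inv2(M):
    """Exact inverse of a non-singular 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def transform_11(T, A):
    """T'^{i'}_{j'} = A^{i'}_i (A^{-1})^j_{j'} T^i_j for a tensor of order (1, 1)."""
    n = len(T)
    A_inv = inv2(A)
    return [[sum(A[ip][i] * A_inv[j][jp] * T[i][j]
                 for i in range(n) for j in range(n))
             for jp in range(n)] for ip in range(n)]

A = [[1, 2],
     [0, 1]]
T = [[1, 2],
     [3, 4]]          # T[i][j] = T^i_j in the basis [e]

T_new = transform_11(T, A)
trace_old = sum(T[i][i] for i in range(2))
trace_new = sum(T_new[i][i] for i in range(2))
print(T_new, trace_old, trace_new)
```

The individual components change, while the contraction $T^i_i$ keeps its value in both bases.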


The discussion so far has not specified either the group G or the dimension of 
the space n. In order to define a specific group G of transformations in a linear space 
we must introduce a geometric structure in the space whose symmetry group will 


¹⁴ To be precise, using G we define an equivalence relation in $F(V^n)$ whose classes are the subsets
mentioned above.




be the group G. Without getting into details, we consider in the linear space $V^n$ the
structure inner product. As we have shown the inner product defines the group of
isometries in $V^n$, which can be used as the group G. In that spirit the Euclidian inner
product defines the Euclidian tensors and the Lorentz inner product the Lorentz
tensors.

The selection of the dimension of the linear space and the group of coordinate 
transformations G by a theory of Physics, is made by means of principles, which 
satisfy certain physical criteria. 


a. The group of transformations G attains physical meaning only after a correspon- 
dence has been defined between the G—coordinate systems and the characteristic 
frames of reference of the theory and, 

b. The physical quantities of the theory are described in terms of G—tensors. 


As a result of the tensorial character it is enough to give a physical quantity in
one characteristic frame of the theory¹⁵ and compute it in any other frame (without
any further experimentation or measurements!) using the appropriate element of
G relating the two frames. This procedure achieves the "de-personalization" of
Physics, that is, all frames (observers) are "equal" and defines the "objectivity" of
the theories of Physics. We shall refer to that topic more in Chap. 2, when we discuss
the covariance principle.

An important class of G-tensors which deserves special reference are the
G-invariants. These are tensors of class (0, 0), so that they have no indices and
under G-transformations retain their value, that is, their transformation is:

$$a' = a. \tag{1.17}$$

It is important to note that a scalar is not necessarily invariant under a group G
whereas an invariant is always a scalar. For example the time $t$ is a scalar and
invariant in Newtonian Physics ($t' = t$) but it is not invariant in Special Relativity. A second
example is the kinetic energy. Scalar means one component whereas G-invariant
means one component and in addition this component must transform tensorially
under the action of G. Furthermore a G-invariant is not necessarily a G'-invariant
for $G \neq G'$. For example the Newtonian time is invariant under the Galileo
group (i.e. $t = t'$) but not invariant under the Lorentz group (as we shall see
$t' = \gamma(t - \beta x/c)$).
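
The distinction between a scalar and an invariant can be made concrete with the boost formula quoted above. The sketch below assumes units with c = 1, the companion relation x' = γ(x − βt) (the standard form, not stated in this section), and the illustrative value β = 3/5. It compares the time component, which changes, with the quadratic form −t² + x², which does not.

```python
from fractions import Fraction

beta = Fraction(3, 5)
gamma = Fraction(5, 4)        # 1 / sqrt(1 - beta^2) for beta = 3/5

def boost(t, x):
    """Boost along x with c = 1: t' = gamma (t - beta x), x' = gamma (x - beta t)."""
    return gamma * (t - beta * x), gamma * (x - beta * t)

def interval(t, x):
    """Lorentz length squared of the vector (t, x) for the metric diag(-1, 1)."""
    return -t * t + x * x

t, x = 5, 3                   # hypothetical components in the first frame
t_new, x_new = boost(t, x)

print(t, t_new)                                   # the time component changes
print(interval(t, x), interval(t_new, x_new))     # the Lorentz length does not
```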


A question we have to answer at this point is: 


Given a G-tensor, how can we construct/define new G-tensors?


The answer to this question is the following simple rule. 


¹⁵ In the Newtonian theory and the Theory of Special Relativity these frames/observers are the
inertial frames/observers.




Proposition 1.4.1 (Construction of G-tensors) There are two methods to
construct G-tensors from given G-tensors:


1. Differentiation of a G-tensor with a G-invariant
2. Multiplication of a G-tensor with a G-invariant.


1.4.1 Operations of Tensors 


The G-tensors in $V^n$ are linear geometric objects which can be combined with
algebraic operations.

Let $T = T^{i_1 \ldots i_r}_{j_1 \ldots j_s}$ and $S = S^{i_1 \ldots i_r}_{j_1 \ldots j_s}$ be two G-tensors of order (r, s) and let
$R = R^{k_1 \ldots k_m}_{l_1 \ldots l_n}$ be a G-tensor of order (m, n) (all components refer to the same
G-basis!). We define:

1. Addition (subtraction) of tensors
The sum (difference) T ± S of the G-tensors T, S is defined to be the
G-tensor of type (r, s) whose components are the sum (difference) of the
corresponding components of the tensors $T = T^{i_1 \ldots i_r}_{j_1 \ldots j_s}$ and $S = S^{i_1 \ldots i_r}_{j_1 \ldots j_s}$.
2. Multiplication of tensors (tensor product)
The tensor product T ⊗ R is defined to be the G-tensor of type (r + m, s + n)
whose components are the product of the components of the tensors T, R.
3. Contraction of indices
When in a G-tensor of order (r, s) we sum over a contravariant and a covariant
index then we obtain a G-tensor of type (r − 1, s − 1).
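
The three operations can be sketched for the lowest orders, with components stored as nested Python lists. The example below uses a hypothetical vector $u^i$ of order (1, 0) and a hypothetical covector $w_j$ of order (0, 1). Their tensor product has order (1, 1) and its contraction has order (0, 0). The function names are illustrative.

```python
def add(T, S):
    """Sum of two tensors of order (1, 1), component by component."""
    return [[T[i][j] + S[i][j] for j in range(len(T[0]))] for i in range(len(T))]

def tensor_product(u, w):
    """(u ⊗ w)^i_j = u^i w_j for a vector u and a covector w."""
    return [[ui * wj for wj in w] for ui in u]

def contract(M):
    """Sum over the upper and the lower index of a tensor of order (1, 1)."""
    return sum(M[i][i] for i in range(len(M)))

u = [1, 2, 3]      # components u^i
w = [4, 5, 6]      # components w_j

product = tensor_product(u, w)     # order (1, 1), nine components
doubled = add(product, product)    # still order (1, 1)
scalar = contract(product)         # order (0, 0): u^i w_i

print(product)
print(scalar)
```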


In this book we shall use the tensor operations mainly for vectors and tensors of 
second order, therefore we shall not pursue the study of these operations further. 
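The three operations can be illustrated on components. The following is only a minimal sketch, assuming that the components of vectors and second order tensors are stored as plain Python lists in one fixed basis; the function names `add`, `tensor_product` and `contract` are our own illustrative choices and do not come from the text.

```python
# Minimal sketch: components of low-order tensors stored as nested lists.
# All names are hypothetical helpers introduced for illustration only.

def add(T, S):
    """Sum of two second order tensors whose components refer to the same basis."""
    return [[T[i][j] + S[i][j] for j in range(len(T[0]))] for i in range(len(T))]

def tensor_product(u, v):
    """Tensor product of two vectors: the components are the products u_i * v_j."""
    return [[ui * vj for vj in v] for ui in u]

def contract(T):
    """Contraction of a (1,1) tensor over its two indices (the trace)."""
    return sum(T[i][i] for i in range(len(T)))

u = [1.0, 2.0, 3.0]          # components of a contravariant vector
v = [4.0, 5.0, 6.0]          # components of a covariant vector
P = tensor_product(u, v)     # a tensor of type (1, 1)
Q = add(P, P)                # same type as P
trace_P = contract(P)        # an invariant: the sum of u_i * v_i
```

Here the contraction of the tensor product of a vector and a covector reproduces their inner product, which is the simplest instance of rule 3.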

There is a final important point concerning tensors in a metric space. Indeed in
such a space the metric tensor can be used to raise and lower an index as follows:


$$T_{a}{}^{i_2 \ldots i_r}{}_{j_1 \ldots j_s} = g_{a i_1} T^{i_1 i_2 \ldots i_r}{}_{j_1 \ldots j_s}$$

$$T^{a i_1 \ldots i_r}{}_{j_2 \ldots j_s} = g^{a j_1} T^{i_1 \ldots i_r}{}_{j_1 j_2 \ldots j_s}$$


This implies that in a metric space the contravariant and the covariant indices
lose their character and become equivalent. Care must be taken when we
change the relative position of the indices in a tensor, because that may affect the
symmetries of the tensor. However let us not worry about that for the moment.
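As an illustration of raising and lowering an index, the sketch below lowers the index of a vector with the canonical Lorentz metric diag(−1, 1, 1, 1) and then raises it again. It is only a sketch under the assumption that components are stored as Python lists; the names `ETA` and `apply_metric` are hypothetical.

```python
# Minimal sketch: lowering and raising one index with the metric diag(-1, 1, 1, 1).
# ETA and apply_metric are illustrative names, not notation of the text.

ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]

def apply_metric(metric, u):
    """Return the components metric_{ij} u^j (a sum over the repeated index j)."""
    n = len(u)
    return [sum(metric[i][j] * u[j] for j in range(n)) for i in range(n)]

u_up = [2.0, 1.0, 0.0, 3.0]          # contravariant components
u_down = apply_metric(ETA, u_up)     # covariant components: only the zeroth one changes sign
# For this particular metric the inverse matrix equals the matrix itself,
# so the same routine raises the index again.
u_back = apply_metric(ETA, u_down)
```

For a Euclidian metric in an ECF the matrix is the identity, which is why in that case the position of an index makes no numerical difference.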

In Fig. 1.1 we show the role of the Euclidian and the Lorentz inner product on:


1. The definition of the subgroups E(n) and L(4) of GL(n, R).

2. The definition of the ECF and the LCF in the set $F(V^n)$.

3. The definition of the TE(n)-tensors and the TL(4)-tensors in the TGL(n, R)-tensors on $V^n$.


Fig. 1.1 The role of the Euclidian and the Lorentz inner products


1.5 The case of Euclidian Geometry 


We consider the Euclidian space¹⁶ $E^3$ and the group $g_E$ of all coordinate transformations
which leave the canonical form of the Euclidian metric the same,
that is $g_{E\mu'\nu'} = g_{E\mu\nu} = \mathrm{diag}(1, 1, 1) = I$. These coordinate transformations
are the $g_E$-canonical transformations, which in Sect. 1.4 we called Euclidian
Orthogonal transformations (EOT) and the group they form is the Galileo group.
The $g_E$-coordinate systems are the Euclidian coordinate frames (ECF) mentioned
in Sect. 1.3. In an ECF, and only in these coordinate frames, the expressions of the
Euclidian inner product and the Euclidian length are written as follows:


$$\mathbf{u} \cdot \mathbf{v} = u^\mu v_\mu = u^1 v^1 + u^2 v^2 + u^3 v^3 \tag{1.18}$$


$$\mathbf{u} \cdot \mathbf{u} = u^\mu u_\mu = (u^1)^2 + (u^2)^2 + (u^3)^2 \tag{1.19}$$


where $(u^1, u^2, u^3)^t$ and $(v^1, v^2, v^3)^t$ are the components of the vectors u, v in an ECF
and t indicates the transpose of a matrix.

In order to compute the explicit form of the elements of the Euclidian group we
consider the fundamental equation of isometry (1.15) and make use of the definition
$g_{E\mu'\nu'} = g_{E\mu\nu} = \mathrm{diag}(1, 1, 1) = I$ of an ECF. We find immediately that the
transformation matrix A satisfies the relation


$$A^t A = I \tag{1.20}$$


16 The following hold for a Euclidian space of any finite dimension.




Relation (1.20) means that the inverse of the matrices representing an EOT equals
their transpose. In order to compute all EOT it is enough to solve the matrix equation (1.20).
This equation is solved as follows. We consider first the simple case of a
two dimensional (n = 2) Euclidian space $E^2$ in which the dimension of the Euclidian
group is three ($\frac{2 \cdot 3}{2} = 3$), therefore we need three parameters to describe the general
element of the isometry group. Two of them (n = 2) are used to describe translations
and the remaining one ($\frac{n(n-1)}{2} = \frac{2 \cdot 1}{2} = 1$) is used for the description of rotations.
In order to compute the rotations we write:

$$A = \begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix} \tag{1.21}$$


where $a_1, a_2, a_3, a_4$ are functions of the (group) parameter φ (say). Replacing in
(1.20) we find that the functions $a_1, a_2, a_3, a_4$ must satisfy the following relations:


$$a_1^2 + a_2^2 = 1 \tag{1.22}$$
$$a_1 a_3 + a_2 a_4 = 0 \tag{1.23}$$
$$a_3^2 + a_4^2 = 1. \tag{1.24}$$


This is a system of three simultaneous equations in four unknowns, therefore the
solution is expressed in terms of one parameter, as expected. It is easy to show that a
solution of the system is:


$$a_1 = a_4 = \pm\cos\phi, \quad a_2 = -a_3 = \pm\sin\phi \tag{1.25}$$


where φ is the group parameter.

The determinant of the transformation is $a_1 a_4 - a_2 a_3 = \pm 1$. This means that the
Euclidian isometry group of $E^2$ has two subsets. The first, defined by the value
det A = +1, is called the proper Euclidian group of rotations and it is a subgroup
of E(3). The other set, defined by the value det A = −1, is not a group (because it
does not contain the identity element). We conclude that the elements of the proper
two dimensional rotational Euclidian group have the general form:


$$A = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix} \tag{1.26}$$


In order to give the parameter φ a geometric meaning we consider the space $E^2$
to be the plane x-y of the ECF Σ(xyz). Then the parameter φ is the angle of the
(passive) clockwise or left hand rotation of the plane x-y around the z-axis.

Fig. 1.2 Euler angles


From the general expression of an EOT in two dimensions we can produce
the corresponding expression in three dimensions as follows. First we note that
the number of required parameters is $\frac{3 \cdot 4}{2} = 6$, three for translations along the
coordinate axes and three for rotations. There are many sets of three parameters
for the rotations, the most common being the Euler angles. The steps for the
computation of the general EOT in terms of these angles are the following (see
also Fig. 1.2)¹⁷:


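(i) Rotation of the x-y plane about the z-axis with angle φ:


$$\begin{pmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{1.27}$$


Let X, Y, z be the new coordinate axes.
(ii) Rotation of the Y-z plane about the X axis with angle θ:


$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix} \tag{1.28}$$


Let X, Y′, z′ be the new axes.
(iii) Rotation of the X-Y′ plane about the z′ axis with angle ψ:


$$\begin{pmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{1.29}$$


Let x′, y′, z′ be the new axes.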


17 For more details see “Classical Mechanics”, H. Goldstein, C. Poole, J. Safko, Third Edition (2002),
Addison Wesley. We use the passive interpretation of rotations. See also “Rotational motion of rigid
bodies” by Dmitry Garanin, https://www.lehman.edu/faculty/dgaranin/Mechanics/Mechanis_of_rigid_bodies.pdf,
for the active interpretation.




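Multiplication of the three matrices gives the total EOT {x, y, z} → {x′, y′, z′}:


$$\begin{pmatrix}
\cos\psi\cos\phi - \sin\psi\cos\theta\sin\phi & \cos\psi\sin\phi + \sin\psi\cos\theta\cos\phi & \sin\psi\sin\theta \\
-\sin\psi\cos\phi - \cos\psi\cos\theta\sin\phi & -\sin\psi\sin\phi + \cos\psi\cos\theta\cos\phi & \cos\psi\sin\theta \\
\sin\theta\sin\phi & -\sin\theta\cos\phi & \cos\theta
\end{pmatrix} \tag{1.30}$$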
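The composition of the three rotations can be checked numerically. The sketch below multiplies the three passive rotation matrices in the order (iii)(ii)(i) and verifies that the product is orthogonal with determinant +1. The helper names (`rot_z`, `rot_x`, `euler_matrix`, `det3`) and the sample angles are our own assumptions, chosen only for illustration.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def rot_z(a):
    """Passive rotation about the third axis, the form used in steps (i) and (iii)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(a):
    """Passive rotation about the first axis, the form used in step (ii)."""
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def euler_matrix(phi, theta, psi):
    """Total EOT: first phi about z, then theta about X, then psi about z'."""
    return matmul(rot_z(psi), matmul(rot_x(theta), rot_z(phi)))

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

phi, theta, psi = 0.3, 1.1, -0.7           # arbitrary sample angles
A = euler_matrix(phi, theta, psi)
AtA = matmul(transpose(A), A)
is_orthogonal = all(abs(AtA[i][j] - (1.0 if i == j else 0.0)) < 1e-12
                    for i in range(3) for j in range(3))
determinant = det3(A)
```

The entries of `A` can also be compared one by one with the closed form of the product, for example the last column (sin ψ sin θ, cos ψ sin θ, cos θ).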


The three angles θ, φ, ψ (Euler angles) are the three group parameters which
describe the general rotation of a Euclidian isometry. We recall at this point that the 
full Euclidian isometry is given by the product of a general translation and a general 
rotation in any order, because the two actions commute. This result is the geometric 
explanation of the statement of Newtonian Mechanics that any motion of a rigid 
body can be described in terms of one rotation and one translation in any order. 


1.6 The Lorentz Geometry 


The theory of Special Relativity is developed on Minkowski space $M^4$. The
Minkowski space is a flat linear space of dimension n = 4 endowed with the
Lorentz inner product or metric. The term flat¹⁸ means that we can employ a unique
coordinate system to cover all $M^4$ and map all $M^4$ in $R^4$. The vectors of the space
$M^4$ we call four-vectors and denote as $u^i, v^i, \ldots$ where the index i = 0, 1, 2, 3. The
component which corresponds to the value i = 0 shall be called temporal or zeroth
component and the other three spatial components. The spacetime indices shall be
denoted with small Latin letters i, j, k, a, b, c, ... and will be assumed to take the
values 0, 1, 2, 3. The Greek indices will be used to indicate the spatial components
and take the values 1, 2, 3.

The group of $g_L$-isometries of spacetime is the Poincaré group. The elements
of this group are linear transformations of $M^4$ which preserve the Lorentz inner
product and consequently the lengths of the four-vectors. The Lorentz metric is not
positive definite and the Lorentz length $u^2 = g_{Lij} u^i u^j$ of a four-vector can be
positive, negative or zero. Because the length of a four-vector is invariant under
a Lorentz isometry we can divide the four-vectors in $M^4$ in three large and
non-intersecting sets:


Null four-vectors: $u^2 = 0$
Timelike four-vectors: $u^2 < 0$
Spacelike four-vectors: $u^2 > 0$.


18 The spacetime used in General Relativity has curvature and in general there does not exist a
unique chart to cover all spacetime.




Considering an arbitrary point O of spacetime as the origin we can describe any
other point by its position vector wrt this origin. Applying the above classification
of four-vectors we can divide $M^4$ in three large and non-intersecting parts. The
first part consists of the points whose position vector wrt O is null. This is a
3-dimensional subspace (a hypersurface) in $M^4$, which we call the null cone at O.
The second (resp. third) part consists of the points inside (resp. outside) the null
cone with a timelike (resp. spacelike) position vector wrt the selected origin O of
$M^4$. We note that the null cone is characteristic of the point O, which has been
selected as the origin of $M^4$. That is, at every point there exists a unique null cone
associated with that point and different points have different null cones. Furthermore
the null cone at a point consists of two cones with common apex. One of these cones
we call the future light cone and the other the past light cone.

Concerning the physical interpretation we associate the points on the null cone
with light rays and the points inside the null cone with events which describe the
“motion” of massive particles and observers. We consider two classes of timelike
and null four-vectors, the future directed and the past directed, the former defined
by points which are inside or on the future light cone and the latter defined by
points which are inside or on the past light cone.¹⁹

Concerning the geometry of $M^4$ we have the following simple but important
result.²⁰


Proposition 1.6.1 The sum of future-directed timelike or future-directed timelike 
and future-directed null four-vectors is a future-directed timelike four-vector except 
if, and only if, all four-vectors are future-directed null and parallel in which case 
the sum is a future-directed null four-vector parallel to the null four-vectors. 


This result is important because it allows us to study reactions of elementary
particles including photons. Indeed, as we shall see, the elementary particles are
characterized by their four-momenta, which are null for photons and timelike for
the rest of the particles with mass. Then Proposition 1.6.1 says that from the
interaction of particles and photons result again particles and photons and, in the
case of a light beam consisting only of parallel photons, this stays a light beam as
it propagates. As we shall see the existence of light beams in Minkowski space is
vital to Special Relativity because they are used for the measurement of the position
vector (coordinatization) in spacetime.
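The classification of four-vectors and the content of Proposition 1.6.1 can be illustrated with a few numbers. The following is a minimal sketch, assuming components in a LCF with the metric diag(−1, 1, 1, 1); the function names and the sample vectors are hypothetical and serve only as an example.

```python
def lorentz_length(u):
    """Lorentz length u^2 of a four-vector with components (u0, u1, u2, u3) in a LCF."""
    return -u[0] ** 2 + u[1] ** 2 + u[2] ** 2 + u[3] ** 2

def classify(u, tol=1e-12):
    """Return 'null', 'timelike' or 'spacelike' according to the sign of u^2."""
    s = lorentz_length(u)
    if abs(s) <= tol:
        return "null"
    return "timelike" if s < 0 else "spacelike"

def is_future_directed(u):
    """For timelike and null four-vectors: positive zeroth component."""
    return u[0] > 0

def add4(u, v):
    return [a + b for a, b in zip(u, v)]

p = [2.0, 1.0, 0.0, 0.0]     # future-directed timelike
k1 = [1.0, 0.0, 1.0, 0.0]    # future-directed null
k2 = [3.0, 0.0, 3.0, 0.0]    # future-directed null, parallel to k1

mixed_sum = add4(p, k1)      # timelike plus null
beam_sum = add4(k1, k2)      # two parallel null four-vectors
```

With these sample values `mixed_sum` is timelike and future directed, whereas `beam_sum` is again null, in agreement with the exceptional case of the proposition.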


1.6.1 Lorentz Transformations 


The Poincaré group consists of all transformations of $M^4$ which satisfy the matrix
equation $[f^{-1}]^t [\eta] [f^{-1}] = [\eta]$ where $[\eta] = \mathrm{diag}(-1, 1, 1, 1)$ is the canonical
form of the Lorentz metric. The dimension of the Poincaré group is $\frac{4 \cdot 5}{2} = 10$,
therefore an arbitrary element of the group is described in terms of 10 parameters.
Four of these parameters (n = 4) concern the closed (Abelian) subgroup of
translations and the other six the subgroup of rotations. This latter subgroup is
called the Lorentz group and the resulting coordinate transformations the Lorentz
transformations. As was the case with the Euclidian group every element of the
Poincaré group is decomposed as the product of a translation and a Lorentz
transformation (rotation) about a characteristic direction. Therefore in order to
compute the general Poincaré transformation it is enough to compute the Lorentz
transformations (rotations) defined by the matrix equation:


$$[L]^t [\eta] [L] = [\eta]. \tag{1.31}$$


19 A precise definition of future and past timelike and null vectors will be given below.
20 For a proof see Theorem 17.4.1 Sect. 17.4.


In Sect. 1.7 we solve this equation directly using formal algebra and compute all
Lorentz transformations. However this solution lacks geometric insight and does
not make clear their relation with the Euclidian transformations. Therefore at this
point we work differently and compute the Lorentz transformations in the same way
we did for the Euclidian rotations. For that reason we shall use the Euler angles (see
Sect. 1.5) and in addition three more spacetime rotations, that is, rotations of the two
dimensional planes (l, x), (l, y), (l, z) (to be defined below) about the spatial axes.
These planes have a Lorentz metric and we call them hyperbolic planes. Let us see
how the method works.

We consider first a 3 × 3 EOT, E say, in three dimensional Euclidian spatial space
and the 4 × 4 block matrix²¹:


$$R(E) = \begin{pmatrix} 1 & 0 \\ 0 & E \end{pmatrix} \tag{1.32}$$

The matrix E satisfies $E^t E = I_3$ and it is described in terms of three parameters
(e.g. the Euler angles), therefore the same holds for the matrix R. It is easy to show
that the matrix R satisfies the equation:


$$R(E)^t \eta R(E) = \eta$$


therefore it is a Lorentz transformation. Furthermore one can show that the set of
all matrices R of the form (1.32) is a closed subgroup of the group of Lorentz
transformations. This means that the general Lorentz transformation can be written
as the product of two matrices as follows:


$$L(\beta, E) = R(E) L(\beta) \tag{1.33}$$


21 A block matrix is a matrix whose elements are matrices. With block matrices we can perform
all matrix operations provided the element matrices are of a suitable dimension. Here the element
(1, 2) is a 1 × 3 matrix and the element (2, 1) a 3 × 1 matrix.




where L(β) is a transformation we must find and R(E) is a Lorentz transformation
of the form (1.32). The vector β (= $\beta^\mu$) involves three independent parameters and
the symbol E refers to the three parameters of the Euclidian rotations (e.g. the Euler
angles). We demand that the general Lorentz transformation L(β, E) satisfies the
defining equation (1.31) and get:


$$L^t R(E)^t \eta R(E) L = \eta \Rightarrow L^t \eta L = \eta \tag{1.34}$$


which implies that L(β) is also a Lorentz transformation.

The Lorentz transformation L(β) contains the non-Euclidian part of the general
Lorentz transformation, therefore it contains all spacetime rotations, that is, rotations
which involve in some way the zeroth component l. There can be two types of
such transformations: Euclidian rotations about the l axis and rotations of one of the
planes (l, x), (l, y), (l, z) about the spatial axes normal to these planes.
This latter type of Lorentz transformations we call boosts. The Euclidian rotations we
have already computed in Sect. 1.5. Therefore it remains to compute the rotation of
the hyperbolic planes.

We consider the linear transformation (boost):


$$l' = a l + b x$$
$$x' = c l + d x$$


which defines the transformation matrix:


$$L(\psi) = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$


The parameter ψ is a real parameter whose geometric significance has to be
determined. We demand the matrix L(ψ) to satisfy the equation of isometry (1.34)
and find the following conditions on the coefficients a, b, c, d:


$$a^2 - c^2 = 1$$
$$d^2 - b^2 = 1$$
$$ab = cd.$$

This system of simultaneous equations has two solutions:


$$a = d = \cosh\psi, \quad c = b = \sinh\psi \tag{1.35}$$

and

$$a = -d = \cosh\psi, \quad c = -b = \sinh\psi \tag{1.36}$$




It follows that geometrically ψ is the hyperbolic angle of rotation in the plane (l, x).
We call ψ the rapidity of the boost. The solution (1.35) has det L(ψ) = +1 and
leads to the subgroup of proper Lorentz transformations. The solution (1.36)
has det L(ψ) = −1 and leads to the improper Lorentz transformations which do
not form a group. Because we have also two types of Euclidian transformations
eventually we have four classes of general Lorentz transformations. We choose the
solution (1.35) and write:

$$L(\psi) = \begin{pmatrix} \cosh\psi & \sinh\psi \\ \sinh\psi & \cosh\psi \end{pmatrix} \tag{1.37}$$


It is easy to prove the relation:

$$L(\psi) L(-\psi) = I$$

that is, the inverse of L(ψ) is found if we replace ψ with −ψ, in which case we get


$$L^{-1}(\psi) = \begin{pmatrix} \cosh\psi & -\sinh\psi \\ -\sinh\psi & \cosh\psi \end{pmatrix}.$$


We introduce the new parameter β (|β| < 1) with the relation:


$$\cosh\psi = \gamma = \frac{1}{\sqrt{1 - \beta^2}}. \tag{1.38}$$


We compute sinh ψ = ±βγ and, taking sinh ψ = −βγ, we finally have for a boost in the (l, x) plane:


$$L(\beta) = \begin{pmatrix} \gamma & -\beta\gamma \\ -\beta\gamma & \gamma \end{pmatrix} \tag{1.39}$$


We have also the obvious relation $L^{-1}(\beta) = L(-\beta)$.
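The properties of the hyperbolic rotation are easy to verify numerically. The sketch below builds the matrix with entries cosh ψ and sinh ψ, checks the isometry condition in the plane (l, x), the inverse relation, and the relation between the rapidity and the factor γ. The function names and the sample rapidities are our own assumptions. As an additional, well known property, it also checks that rapidities add when two boosts in the same plane are composed.

```python
import math

ETA2 = [[-1.0, 0.0], [0.0, 1.0]]      # Lorentz metric restricted to the plane (l, x)
IDENTITY2 = [[1.0, 0.0], [0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def close(A, B, tol=1e-9):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def boost(psi):
    """Hyperbolic rotation of the plane (l, x) with rapidity psi."""
    c, s = math.cosh(psi), math.sinh(psi)
    return [[c, s], [s, c]]

psi = 0.8                                            # arbitrary sample rapidity
L = boost(psi)
isometry_ok = close(matmul(transpose(L), matmul(ETA2, L)), ETA2)
inverse_ok = close(matmul(L, boost(-psi)), IDENTITY2)

beta = math.tanh(psi)                                # |beta| < 1 for every real psi
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
gamma_ok = abs(gamma - math.cosh(psi)) < 1e-9        # cosh(psi) equals gamma

rapidities_add = close(matmul(boost(0.8), boost(0.5)), boost(1.3))
```

The sign of β relative to ψ depends on the sign chosen for sinh ψ; the sketch only uses the magnitude relation between them.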

We continue with the computation of the general Lorentz transformation. We
assume that the three parameters $(\beta_x, \beta_y, \beta_z)$ ($|\beta_\mu| < 1$) which concern the
general Lorentz transformation L(β) define in the three dimensional spatial space
the direction cosines of a characteristic direction specified uniquely by the LCFs
(l, x, y, z), (l′, x′, y′, z′). The rotations involved in L(β) must be rotations of
hyperbolic planes about the other spatial axes, therefore there is no room for
Euclidian rotations. This implies that we must consider the axes (xyz), (x′y′z′) of
the two LCF as being “parallel”. This parallelism is of a Euclidian nature; therefore
it is not Lorentz invariant and must be defined. We do this below.

Without restricting the generality of our considerations, and in order to have
the possibility of visual representation, we suppress one dimension and consider
the LCF (l, x, y), (l′, x′, y′). In order to calculate L(β) we shall use the angles
of rotation of the Euclidian case with the difference that they will be treated as
hyperbolic instead of Euclidian. Following the discussion of Sect. 1.5 we have the
three rotations (see Fig. 1.3).




Fig. 1.3. The Lorentz transformation 


(A1) Rotation of the Euclidian plane (x, y) about the l axis with Euclidian angle
φ so that the new coordinate Y will be parallel to the characteristic spatial
direction defined by the direction cosines $\beta_x, \beta_y$. The transformation is:


(l, x, y) → (l, X, Y)


and has matrix:


$$A_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\phi & \sin\phi \\ 0 & -\sin\phi & \cos\phi \end{pmatrix}. \tag{1.40}$$


(L1) Rotation of the hyperbolic plane (l, y) about the X axis with hyperbolic angle
ψ, which is fixed by the parameter $\beta = \sqrt{\beta_x^2 + \beta_y^2}$ via the relations cosh ψ =
γ, sinh ψ = βγ. The transformation is:


(l, X, Y) → (l′, X, Y′)

and has matrix:


$$L_1(\beta) = \begin{pmatrix} \gamma & -\beta\gamma & 0 \\ -\beta\gamma & \gamma & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{1.41}$$




(A2) Rotation of the Euclidian plane (X, Y′) about the l′ axis with Euclidian angle
−φ in order to reverse the initial rotation of the spatial axes and make them
(by definition!) “parallel”. The transformation is:


(l′, X, Y′) → (l′, x′, y′)


and has matrix:


$$A_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\phi & -\sin\phi \\ 0 & \sin\phi & \cos\phi \end{pmatrix}. \tag{1.42}$$


We note that $A_1 A_2 = I$. This relation defines the parallelism of the spatial axes
(“Euclidian parallelism”) in the geometry of the Minkowski space $M^4$. We note
that in $M^4$ the initial and the final axes do not coincide and are not parallel in the
Euclidian sense, as someone might have expected. Indeed the rotation −φ takes
place in the plane (X, Y′), which is normal to l′, whereas the rotation φ takes place in the plane
(x, y), which is normal to the l axis. Therefore the axes (x′, y′) (in the space $M^4$)
are on a different plane from the initial axes (x, y).

Obviously we must expect a different and “strange” behavior of the Euclidian
“parallelism” in the geometry of $M^4$. For example if the spatial axes of the LCF
(l, x, y) and (l′, x′, y′) are parallel and the same holds for the LCF (l′, x′, y′)
and (l″, x″, y″) then the spatial axes of the LCF (l, x, y) and (l″, x″, y″) are
not in general parallel in the Euclidian sense. We shall discuss the consequences
of the Euclidian parallelism in Minkowski space when we consider the Thomas
precession.

The general Lorentz transformation L(β) is the combination of these three
rotations in the same way the general Euclidian transformation is derived from the
composition of the three Euler rotations. That is we have²²:


$$L(\beta) = A_2(\phi) L_1(\beta) A_1(\phi). \tag{1.43}$$


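Replacing the matrices from relations (1.40), (1.41), and (1.42) and writing $\beta_x = \beta\cos\phi$, $\beta_y = \beta\sin\phi$ we compute easily the following result:


$$L(\beta) = \begin{pmatrix} \gamma & -\gamma\beta_x & -\gamma\beta_y \\ -\gamma\beta_x & 1 + \frac{\gamma-1}{\beta^2}\beta_x^2 & \frac{\gamma-1}{\beta^2}\beta_x\beta_y \\ -\gamma\beta_y & \frac{\gamma-1}{\beta^2}\beta_x\beta_y & 1 + \frac{\gamma-1}{\beta^2}\beta_y^2 \end{pmatrix}. \tag{1.44}$$


22 Note that $A_2(\phi) = A_1(-\phi)$!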




This matrix can be written in a more compact form which holds generally, that is
including the ignored coordinate z, as follows:


$$L(\beta) = \begin{pmatrix} \gamma & -\gamma\beta_\mu \\ -\gamma\beta^\mu & \delta^\mu_\nu + \frac{\gamma-1}{\beta^2}\beta^\mu\beta_\nu \end{pmatrix}. \tag{1.45}$$


The transformation (1.45) holds only when the axes of the initial and the final
LCF are parallel (in the Euclidian sense) and have the same orientation (that is
both left-handed or both right-handed). According to equation (1.33) the most
general Lorentz transformation is given by the product of this transformation and
a transformation of the form (1.32) defined by a Euclidian rotation.


Exercise 1.6.1 Multiply the matrices in (1.43) and show that the matrix (1.44) and
the matrix (1.45) satisfy the isometry equation $L(\beta)^t \eta L(\beta) = \eta$. Compute the
determinant of this transformation and show that it equals +1. Conclude that the
transformation (1.45) describes indeed the general proper Lorentz transformation.
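Before doing the exercise by hand it may help to see a numerical spot check for one choice of the parameters. The sketch below works in the reduced LCF (l, x, y), multiplies the three rotations of relation (1.43), and compares the product with the closed form (1.44). It is not a proof; the function names and the sample values β = 0.6, φ = 0.4 are our own assumptions.

```python
import math

ETA3 = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # metric in (l, x, y)

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def close(A, B, tol=1e-9):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def A1(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def A2(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def L1(beta):
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[g, -beta * g, 0.0], [-beta * g, g, 0.0], [0.0, 0.0, 1.0]]

def closed_form(bx, by):
    """Matrix of the form (1.44) for the direction cosines bx, by."""
    b2 = bx * bx + by * by
    g = 1.0 / math.sqrt(1.0 - b2)
    k = (g - 1.0) / b2
    return [[g, -g * bx, -g * by],
            [-g * bx, 1.0 + k * bx * bx, k * bx * by],
            [-g * by, k * bx * by, 1.0 + k * by * by]]

beta, phi = 0.6, 0.4                                   # sample values
bx, by = beta * math.cos(phi), beta * math.sin(phi)
product = matmul(A2(phi), matmul(L1(beta), A1(phi)))   # right hand side of (1.43)
target = closed_form(bx, by)

product_matches = close(product, target)
isometry_ok = close(matmul(transpose(target), matmul(ETA3, target)), ETA3)
determinant = det3(target)
```

For these values the product agrees with the closed form, the isometry equation holds and the determinant equals +1 within rounding error.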


It can be shown²³ that the general Lorentz transformation (not only the proper)
has the form:


$$L(\beta) = \begin{pmatrix} D & -(\det L)\, D\beta^t \\ -D\beta & (\det L)\left[I + \frac{D-1}{\beta^2}\beta\beta^t\right] \end{pmatrix} \tag{1.46}$$

where D = ±γ. The transformation (1.46) has four families depending on the sign
of the (00) element and the value det L = ±1. In order to distinguish the four cases we
use for the sign of the term 00 the arrows ↑ / ↓ for +γ and −γ respectively.
Concerning the sign of the determinant we use one further index ±. As a result
of these conventions we have the following four families of Lorentz transformations²⁴:


1. Proper Lorentz transformations (det L = 1).


$$L_{+\uparrow}(\beta) = \begin{pmatrix} \gamma & -\gamma\beta^t \\ -\gamma\beta & I + \frac{\gamma-1}{\beta^2}\beta\beta^t \end{pmatrix} \tag{1.47}$$

23 See Sect. 15.2 for the derivation of the general forms of Lorentz transformation.
24 Note the useful identity $1 + \beta^2\gamma^2 = \gamma^2$. Therefore it is possible to write the Lorentz
transformation in terms of the parameter γ only.




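2. Space inversion Lorentz transformations (det L = −1).

$$L_{-\uparrow}(\beta) = \begin{pmatrix} \gamma & \gamma\beta^t \\ -\gamma\beta & -I - \frac{\gamma-1}{\beta^2}\beta\beta^t \end{pmatrix} \tag{1.48}$$

3. Time inversion Lorentz transformations (det L = −1).

$$L_{-\downarrow}(\beta) = \begin{pmatrix} -\gamma & -\gamma\beta^t \\ \gamma\beta & -I + \frac{\gamma+1}{\beta^2}\beta\beta^t \end{pmatrix} \tag{1.49}$$

4. Space-time inversion Lorentz transformations (det L = +1).

$$L_{+\downarrow}(\beta) = \begin{pmatrix} -\gamma & \gamma\beta^t \\ \gamma\beta & I - \frac{\gamma+1}{\beta^2}\beta\beta^t \end{pmatrix} \tag{1.50}$$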


Only the proper Lorentz transformations form a group (a closed subgroup of the 
Lorentz group). All four types of Lorentz transformations are important in Physics, 
but in the present book (and in most applications) we keep only the proper Lorentz 
transformations. 

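In order to express the Lorentz transformations in terms of components we
consider a four-vector $A^i$ which in the LCF Σ and Σ′ has components $\begin{pmatrix} A^0 \\ \mathbf{A} \end{pmatrix}$,
$\begin{pmatrix} A^{0'} \\ \mathbf{A}' \end{pmatrix}$ which are related by the proper Lorentz transformation $L_{+\uparrow}(\beta)$ relating Σ, Σ′:


$$\begin{pmatrix} A^{0'} \\ \mathbf{A}' \end{pmatrix} = L_{+\uparrow}(\beta) \begin{pmatrix} A^0 \\ \mathbf{A} \end{pmatrix}.$$


Replacing $L_{+\uparrow}(\beta)$ from (1.47) we find:


$$A^{0'} = \gamma (A^0 - \boldsymbol{\beta} \cdot \mathbf{A}) \tag{1.51}$$


$$\mathbf{A}' = \mathbf{A} + \frac{\gamma-1}{\beta^2} (\boldsymbol{\beta} \cdot \mathbf{A}) \boldsymbol{\beta} - \gamma A^0 \boldsymbol{\beta}. \tag{1.52}$$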


A special type of proper Lorentz transformations are the boosts, defined by the
requirement that two of the three direction cosines vanish. In this case we say that the
LCF Σ and Σ′ are moving in the standard configuration along the axis specified
by the remaining direction cosine. For example if $\beta_y = \beta_z = 0$ we have the boost
along the x-axis with factor β and the boost is:


$$\begin{aligned}
A^{0'} &= \gamma (A^0 - \beta A^x) \\
A^{x'} &= \gamma (A^x - \beta A^0) \\
A^{y'} &= A^y \\
A^{z'} &= A^z.
\end{aligned} \tag{1.53}$$

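Exercise 1.6.2 Show that the proper Lorentz transformation for the position four-vector
$\begin{pmatrix} l \\ \mathbf{r} \end{pmatrix}$ is:

$$l' = \gamma (l - \boldsymbol{\beta} \cdot \mathbf{r}) \tag{1.54}$$


$$\mathbf{r}' = \mathbf{r} + \frac{\gamma-1}{\beta^2} (\boldsymbol{\beta} \cdot \mathbf{r}) \boldsymbol{\beta} - \gamma l \boldsymbol{\beta}. \tag{1.55}$$


In the special case of the boost along the x-axis show that the transformation of
the position four-vector is:


$$\begin{aligned}
l' &= \gamma (l - \beta x) \\
x' &= \gamma (x - \beta l) \\
y' &= y \\
z' &= z.
\end{aligned} \tag{1.56}$$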

Example 1.6.1 Prove that the proper Lorentz transformation does not change the
sign of the zeroth component of a timelike four-vector.
Solution

Let $A^i$ be a timelike four-vector which in the LCF Σ has components $A^i = (l, \mathbf{r})^t$,
and in its proper frame Σ⁺ has components $A^i = (l^+, \mathbf{0})^t$. From the proper
Lorentz transformation we have for the zeroth coordinate $l = \gamma l^+$, which proves
that $l$, $l^+$ have the same sign.


The result of Example 1.6.1 allows us to consider two classes of timelike and
null four-vectors at every spacetime point in a covariant manner, according to the sign
of their zeroth component. That is, the future directed timelike/null four-vectors,
which have l > 0 and are directed in the “upper part” (the future) of the null cone, and
the past directed timelike/null four-vectors, which have l < 0 and are directed in
the “lower part” (the past) of the null cone.




1.7 Algebraic Determination of the Lorentz Transformation L(β)


It is generally believed that the determination of the analytical form of the Lorentz
transformation in a LCF requires the use of Special Relativity. This is wrong.
Lorentz transformations are the solutions of the mathematical equation $\eta = L^t \eta L$
where η is the 4 × 4 matrix diag(−1, 1, 1, 1) and L is a matrix to be determined. For
that reason in this section we solve this equation using pure algebra and produce the
so called vector Lorentz transformation. In a subsequent chapter (see Chap. 15),
when the reader will be more experienced, we shall derive the covariant form of the
Lorentz transformation.

In order to solve equation (1.31) we consider an arbitrary LCF and write L as the
block matrix:


$$L = \begin{pmatrix} D & C \\ B & A \end{pmatrix} \tag{1.57}$$


where the submatrices A, B, C, D are as follows:


D: 1 × 1 (a function but not necessarily an invariant!)
B: 3 × 1
C: 1 × 3
A: 3 × 3.


Equation (1.31) is written as the following matrix equation:


$$\begin{pmatrix} D & B^t \\ C^t & A^t \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & I_3 \end{pmatrix} \begin{pmatrix} D & C \\ B & A \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & I_3 \end{pmatrix}.$$


Multiplying the block matrices we find the matrix equations:


$$A^t A - C^t C = I_3 \tag{1.58}$$
$$B^t A - D C = 0 \tag{1.59}$$
$$B^t B - D^2 = -1 \tag{1.60}$$


whose solution determines the explicit form of the Lorentz transformation. Before 
we attempt the general solution we look at two special solutions of physical 
importance. 


Case 1. C = 0, A ≠ 0


In this case equation (1.58) implies $A^t A = I_3$, therefore A is a Euclidian matrix.
Then equation (1.59) implies B = 0 and equation (1.60) D = ±1. Therefore we
have the two special Lorentz transformations:


$$R_+(E) = \begin{pmatrix} 1 & 0 \\ 0 & E \end{pmatrix}, \quad R_-(E) = \begin{pmatrix} -1 & 0 \\ 0 & E \end{pmatrix}. \tag{1.61}$$


It follows that the EOT's are included in a natural manner in the Lorentz
transformations. We note the relations:


$$\det R_+(E) = +1, \quad \det R_-(E) = -1 \tag{1.62}$$

and

$$R_+^t(E) R_+(E) = I_4, \quad R_-^t(E) R_-(E) = I_4, \quad R_+^t(E) R_-(E) = \eta. \tag{1.63}$$


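Case 2. A = diag(K, 1, 1) (|K| ≥ 1).
Let $B^t = (B_1, B_2, B_3)$, $C = (C_1, C_2, C_3)$. Then equation (1.58) gives:


$$\mathrm{diag}(K^2, 1, 1) - \mathrm{diag}(C_1^2, C_2^2, C_3^2) = \mathrm{diag}(1, 1, 1)$$


from which follows:


$$C_1 = \pm\sqrt{K^2 - 1}, \quad C_2 = C_3 = 0.$$


Equation (1.59) gives:


$$(K B_1, B_2, B_3) = (\pm D \sqrt{K^2 - 1}, 0, 0)$$


from which follows:


$$B_1 = \pm\frac{D}{K}\sqrt{K^2 - 1}, \quad B_2 = B_3 = 0.$$


Finally relation (1.60) implies:


$$\frac{D^2}{K^2}(K^2 - 1) = D^2 - 1 \Rightarrow D = \pm|K|.$$


We conclude that in Case 2 we have the solution (|K| ≥ 1):


$$A = \mathrm{diag}(K, 1, 1), \quad C = (\pm\sqrt{K^2 - 1}, 0, 0), \quad B = (\pm\sqrt{K^2 - 1}, 0, 0)^t, \quad D = \pm|K| \tag{1.64}$$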




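which defines the following eight Lorentz transformations ($C_1 = \sqrt{K^2 - 1}$) for
each value of |K|:


$$L_1 = \begin{pmatrix} K & C_1 & 0 & 0 \\ C_1 & K & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad
L_2 = \begin{pmatrix} -K & -C_1 & 0 & 0 \\ C_1 & K & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

$$L_3 = \begin{pmatrix} K & -C_1 & 0 & 0 \\ -C_1 & K & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad
L_4 = \begin{pmatrix} -K & C_1 & 0 & 0 \\ -C_1 & K & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$


The special solutions L+ are called boosts. We note the relations:


$$\det L_1 = +1, \quad \det L_2 = -1, \quad \det L_3 = +1, \quad \det L_4 = -1.$$

For K = 1 we have the special solution:

$$L_1 = I_4, \quad L_2 = \eta, \quad L_3 = I_4, \quad L_4 = \eta$$


that is, the solutions of Case 1. [Question: What is the significance of this result?
How can it be used in Geometry and Physics?]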


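General solution
We define the 3 × 1 matrix β with the relation:


$$B = -D\beta.$$


Replacing in equation (1.60) we find:


$$D = \pm\frac{1}{\sqrt{1 - \beta^2}}$$


where $\beta^2 = \beta^t\beta$ is a scalar and we assume $0 < \beta^2 < 1$ in order that D ∈ R.
Equation (1.59) implies:


$$-D\beta^t A - DC = 0 \Rightarrow C = -\beta^t A.$$

Replacing in equation (1.58) we find²⁵:

$$A^t A - A^t \beta\beta^t A = I_3 \Rightarrow A^t (I_3 - \beta\beta^t) A = I_3$$
$$\Rightarrow A^t (I_3 - \beta\beta^t) = A^{-1} \Rightarrow A A^t (I_3 - \beta\beta^t) = I_3.$$


25 Note that $\beta\beta^t$ is a symmetric 3 × 3 matrix.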




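To continue we need the inverse of the matrix $(I - \beta\beta^t)$. We claim that
$(I - \beta\beta^t)^{-1} = I + D^2\beta\beta^t$. Indeed:


$$\begin{aligned}
(I - \beta\beta^t)(I + D^2\beta\beta^t) &= I - \beta\beta^t + D^2\beta\beta^t - D^2\beta\beta^t\beta\beta^t \\
&= I + (-1 + D^2 - D^2\beta^2)\beta\beta^t \\
&= I + \left[-1 + D^2(1 - \beta^2)\right]\beta\beta^t = I.
\end{aligned}$$

Therefore $AA^t = I + D^2\beta\beta^t$. In order to compute the matrix A we note that:


$$\begin{aligned}
\left(I + \frac{D-1}{\beta^2}\beta\beta^t\right)^2 &= I + 2\frac{D-1}{\beta^2}\beta\beta^t + \left(\frac{D-1}{\beta^2}\right)^2 \beta^2\, \beta\beta^t \\
&= I + \frac{(D-1)(D+1)}{\beta^2}\beta\beta^t \\
&= I + \frac{D^2-1}{\beta^2}\beta\beta^t = I + D^2\beta\beta^t.
\end{aligned}$$


Using the fact that $\beta\beta^t$ is a symmetric matrix one can show easily that:


$$I + \frac{D-1}{\beta^2}\beta\beta^t = \left(I + \frac{D-1}{\beta^2}\beta\beta^t\right)^t.$$


This implies:

$$\begin{aligned}
AA^t = I + D^2\beta\beta^t &= \left(I + \frac{D-1}{\beta^2}\beta\beta^t\right)\left(I + \frac{D-1}{\beta^2}\beta\beta^t\right)^t \\
&= \left(I + \frac{D-1}{\beta^2}\beta\beta^t\right) E E^t \left(I + \frac{D-1}{\beta^2}\beta\beta^t\right)^t \Rightarrow \\
A &= \pm\left(I + \frac{D-1}{\beta^2}\beta\beta^t\right) E
\end{aligned}$$


where the 3 × 3 matrix E is an EOT, that is, satisfies the property $E^t E = I_3$. Replacing
in $C = -\beta^t A$ we find for the matrix C:


$$C = \mp\beta^t\left(I + \frac{D-1}{\beta^2}\beta\beta^t\right)E = \mp\left(\beta^t + (D-1)\beta^t\right)E = \mp D\beta^t E.$$


We conclude that the general Lorentz transformation is:


$$L(\beta, E) = L(\beta) R(E) \tag{1.65}$$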
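The two matrix identities used in the derivation can be spot checked numerically: the inverse of I − ββᵗ and the "square root" of I + D²ββᵗ. The sketch below does this for both signs of D; the helper names and the sample value of β are our own assumptions and the check is of course not a substitute for the algebraic proof.

```python
import math

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def outer(b):
    """The symmetric 3 x 3 matrix with components b_i * b_j."""
    return [[bi * bj for bj in b] for bi in b]

def add_scaled(A, c, B):
    """Return the matrix A + c * B."""
    return [[A[i][j] + c * B[i][j] for j in range(len(A))] for i in range(len(A))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def close(A, B, tol=1e-9):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

beta = [0.2, -0.3, 0.5]                       # sample parameters, beta^2 = 0.38
b2 = sum(b * b for b in beta)
P = outer(beta)
I3 = identity(3)

checks = []
for sign in (+1.0, -1.0):                     # both signs of D
    D = sign / math.sqrt(1.0 - b2)
    claimed_inverse = add_scaled(I3, D * D, P)            # I + D^2 beta beta^t
    inverse_ok = close(matmul(add_scaled(I3, -1.0, P), claimed_inverse), I3)
    root = add_scaled(I3, (D - 1.0) / b2, P)              # I + (D - 1)/beta^2 beta beta^t
    square_ok = close(matmul(root, root), claimed_inverse)
    checks.append((inverse_ok, square_ok))
```

Both identities hold for either sign of D, because they depend on D only through the relation D²(1 − β²) = 1.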




where the matrix R(E) is one of the solutions of Case 1 (Euclidian solution) and the
matrix L(β) (Relativistic solution) is defined by the block matrix:


$$L(\beta) = \begin{pmatrix} D & \mp D\beta^t \\ -D\beta & \pm\left(I + \frac{D-1}{\beta^2}\beta\beta^t\right) \end{pmatrix} \tag{1.66, 1.67}$$


where D = ±γ, $\gamma = 1/\sqrt{1 - \beta^2}$.

There result four different vector Lorentz transformations L(β) which are
defined by the signs of the terms with D and the term $I + \frac{D-1}{\beta^2}\beta\beta^t$. If we take
into consideration the rotation matrices $R_\pm(E)$ we have in total 8 cases of L(β, E).
In the following with the term Lorentz transformation we shall mean the vector
Lorentz transformation which is given by the matrix L(β) and it is the relativistic
part of the general transformation L(β, E). The role of the Euclidian part R(E)
we shall discuss in Sect. 1.8.1. From (1.67) we find four disjoint subsets of Lorentz
transformations classified by the sign of the determinant of the transformation and
the sign of the components²⁶ with D.

Based on the above results we have the following four classes of Lorentz
transformations:


a. Proper Lorentz transformation (D = γ):


$$L_{+\uparrow}(\beta) = \begin{pmatrix} \gamma & -\gamma\beta^t \\ -\gamma\beta & I + \frac{\gamma-1}{\beta^2}\beta\beta^t \end{pmatrix} \tag{1.68}$$


b. Lorentz transformation with space inversion (D = γ):


$$L_{-\uparrow}(\beta) = \begin{pmatrix} \gamma & \gamma\beta^t \\ -\gamma\beta & -I - \frac{\gamma-1}{\beta^2}\beta\beta^t \end{pmatrix} \tag{1.69}$$


c. Lorentz transformation with time inversion (D = −γ):


$$L_{-\downarrow}(\beta) = \begin{pmatrix} -\gamma & -\gamma\beta^t \\ \gamma\beta & -I + \frac{\gamma+1}{\beta^2}\beta\beta^t \end{pmatrix} \tag{1.70}$$


26 The first parameter has to do with the orientability of $M^4$ and the second with the preservation
of the sense of direction of the timelike curves.




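d. Lorentz transformation with spacetime inversion (D = −γ):

$$L_{+\downarrow}(\beta) = \begin{pmatrix} -\gamma & \gamma\beta^t \\ \gamma\beta & I - \frac{\gamma+1}{\beta^2}\beta\beta^t \end{pmatrix} \tag{1.71}$$


All four forms of the Lorentz transformation are useful in the study of physical
problems. But as a rule we use the proper Lorentz transformations because they
form a (continuous) group of transformations. A closed subgroup in this group are
the boosts along a direction, which are the proper Lorentz transformations defined
by one of the directions $\hat{\beta}_x = (1, 0, 0)$, $\hat{\beta}_y = (0, 1, 0)$, $\hat{\beta}_z = (0, 0, 1)$. For example the boosts
along the x-axis and along the y-axis respectively are:


$$L_{+\uparrow,x}(\beta) = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad
L_{+\uparrow,y}(\beta) = \begin{pmatrix} \gamma & 0 & -\gamma\beta & 0 \\ 0 & 1 & 0 & 0 \\ -\gamma\beta & 0 & \gamma & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$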


Example 1.7.1 Compute det L(β) and comment on the result.
Solution
From the definition of L(β) we have


$$\det(L^t \eta L) = \det\eta.$$


Because the determinant of the product of matrices equals the product of the
determinants, and $\det L^t = \det L$, we have $(\det L)^2 = 1$, that is det L = ±1.

The value det L = +1 corresponds to the cases of the proper Lorentz transformation and
the spacetime inversion, and the value det L = −1 corresponds to the cases of time
inversion and space inversion. In the first two cases the orientation of the spacetime
frame is preserved whereas in the last two it is reversed.
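The result of the example can be illustrated numerically. The sketch below assembles a 4 × 4 block matrix from the relations of the general solution of this section (B = −Dβ, A equal to plus or minus I + (D − 1)/β² ββᵗ, and C = −βᵗA), taking E = I₃, and then computes the determinant and checks the isometry equation for the four sign combinations. Encoding the four classes by the pair (sign of D, sign of the A block) is our own assumption for the purpose of the illustration, as are all function names.

```python
import math

ETA = [[-1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def close(A, B, tol=1e-9):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def det(M):
    """Determinant by Laplace expansion along the first row (fine for 4 x 4)."""
    if len(M) == 1:
        return M[0][0]
    total = 0.0
    for j in range(len(M)):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def lorentz_matrix(beta, sign_D, sign_A):
    """Block matrix [[D, C], [B, A]] built from the general solution with E = I3."""
    b2 = sum(b * b for b in beta)
    D = sign_D / math.sqrt(1.0 - b2)
    k = (D - 1.0) / b2
    A = [[sign_A * ((1.0 if i == j else 0.0) + k * beta[i] * beta[j])
          for j in range(3)] for i in range(3)]
    C = [-sum(beta[i] * A[i][j] for i in range(3)) for j in range(3)]   # C = -beta^t A
    rows = [[D] + C]
    for i in range(3):
        rows.append([-D * beta[i]] + A[i])                              # B = -D beta
    return rows

beta = [0.3, -0.2, 0.4]                                                 # sample parameters
results = {}
for sign_D in (+1, -1):
    for sign_A in (+1, -1):
        L = lorentz_matrix(beta, sign_D, sign_A)
        isometry_ok = close(matmul(transpose(L), matmul(ETA, L)), ETA)
        results[(sign_D, sign_A)] = (isometry_ok, det(L))
```

In this sketch all four matrices satisfy the isometry equation and the determinant comes out equal to the sign of the A block, whatever the sign of D, which is one way to see why the sign of D and the sign of the determinant are independent labels.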


In order to write the proper Lorentz transformation as a coordinate transformation
in $M^4$ we consider an arbitrary four-vector and write²⁷:


$$\begin{pmatrix} l' \\ \mathbf{r}' \end{pmatrix} = L(\beta) \begin{pmatrix} l \\ \mathbf{r} \end{pmatrix}. \tag{1.73}$$


27 The Euclidian part of the transformation E is ignored because we assume that it defines the
relative orientation of the axes of the two coordinate frames related by the Lorentz transformation
and does not affect the transformation of the four-vectors (and more generally tensors) in $M^4$.
Moreover in Sect. 1.8.1 we shall define that the space axes of two LCF are parallel if $E = I_3$. In
other words the Lorentz transformation L(β) relates the coordinates of two LCF whose spatial
axes are parallel. This has to be kept always in mind.




Then we find the following “vector expressions” of the Lorentz transformation: 


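a. Proper Lorentz transformation:

$$\mathbf{r}' = \mathbf{r} + \frac{\gamma - 1}{\beta^2}(\boldsymbol{\beta}\cdot\mathbf{r})\boldsymbol{\beta} - \gamma l \boldsymbol{\beta}, \qquad l' = \gamma(l - \boldsymbol{\beta}\cdot\mathbf{r}). \tag{1.74}$$
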
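b. Lorentz transformation with space inversion:

$$\mathbf{r}' = -\mathbf{r} - \frac{\gamma - 1}{\beta^2}(\boldsymbol{\beta}\cdot\mathbf{r})\boldsymbol{\beta} - \gamma l \boldsymbol{\beta}, \qquad l' = \gamma(l + \boldsymbol{\beta}\cdot\mathbf{r}). \tag{1.75}$$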


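c. Lorentz transformation with time inversion:

$$\mathbf{r}' = \mathbf{r} + \frac{\gamma - 1}{\beta^2}(\boldsymbol{\beta}\cdot\mathbf{r})\boldsymbol{\beta} - \gamma l \boldsymbol{\beta}, \qquad l' = -\gamma(l - \boldsymbol{\beta}\cdot\mathbf{r}). \tag{1.76}$$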


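d. Lorentz transformation with spacetime inversion:

$$\mathbf{r}' = -\mathbf{r} - \frac{\gamma - 1}{\beta^2}(\boldsymbol{\beta}\cdot\mathbf{r})\boldsymbol{\beta} - \gamma l \boldsymbol{\beta}, \qquad l' = -\gamma(l + \boldsymbol{\beta}\cdot\mathbf{r}). \tag{1.77}$$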
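The vector expression of the proper Lorentz transformation is easy to test numerically. The sketch below is a minimal illustration in plain Python of the proper case only, written as $\mathbf{r}' = \mathbf{r} + \frac{\gamma-1}{\beta^2}(\boldsymbol{\beta}\cdot\mathbf{r})\boldsymbol{\beta} - \gamma l\boldsymbol{\beta}$, $l' = \gamma(l - \boldsymbol{\beta}\cdot\mathbf{r})$; the function names and the sample numbers are assumptions made for the example.

```python
import math

def proper_lorentz(beta, l, r):
    """Vector form of the proper Lorentz transformation: returns (l', r')."""
    b2 = sum(b * b for b in beta)
    if b2 == 0.0:
        return l, tuple(r)
    gamma = 1.0 / math.sqrt(1.0 - b2)
    b_dot_r = sum(b * x for b, x in zip(beta, r))
    coeff = (gamma - 1.0) / b2 * b_dot_r - gamma * l   # total factor multiplying beta
    return gamma * (l - b_dot_r), tuple(x + coeff * b for x, b in zip(r, beta))

def lorentz_length(l, r):
    """r.r - l^2, the quantity left invariant by a Lorentz transformation."""
    return sum(x * x for x in r) - l * l

beta = (0.3, 0.4, 0.5)
l, r = 2.0, (1.0, -1.0, 0.5)
l1, r1 = proper_lorentz(beta, l, r)
print(lorentz_length(l, r), lorentz_length(l1, r1))        # equal up to rounding

# Transforming back with -beta recovers the original event (up to rounding).
l2, r2 = proper_lorentz(tuple(-b for b in beta), l1, r1)
print(l2, r2)
```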


In the following example we give a simpler version of the derivation of Lorentz 
transformations using simple algebra. 


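Example 1.7.2. Consider in the space $R^4$ the linear transformations:

$$R^4 \to R^4, \qquad (l, x, y, z) \to (l', x', y', z')$$

which:

(i) Are defined by the transformation equations:

$$\begin{aligned} l' &= \gamma_1 x + \gamma_2 y + \gamma_3 z + \beta_4 l \\ x' &= \alpha_1 x + \beta_1 l \\ y' &= \alpha_2 y + \beta_2 l \\ z' &= \alpha_3 z + \beta_3 l \end{aligned}$$

where the 10 parameters $\alpha_i, \beta_i, \gamma_i$ $(i = 1, 2, 3)$ and $\beta_4$ are such that $\alpha_i \neq 0$, $\beta_4 \neq 0$
and at least one of the $\gamma_i \neq 0$.
(ii) Satisfy the relation:

$$x'^2 + y'^2 + z'^2 - l'^2 = x^2 + y^2 + z^2 - l^2 \tag{1.78}$$

that is, leave the Lorentz length (in a LCF!) invariant.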




(a) Show that the transformations defined by requirements (i), (ii) are 32 and
can be classified by means of one parameter only. Write these transforma-
tions in terms of a general expression.

(b) Compute the determinant of the general transformation and show that 16 of
them have determinant equal to +1 and the remaining 16 have determinant equal
to -1.

(c) Demand further that when $l = l' = 0$, then $x = x'$, $y = y'$, $z = z'$ and
show that with this requirement only 2 transformations survive.

(d) Give a kinematic interpretation of these two transformations. 


Solution 
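From the transformation equations we have:

$$\begin{aligned} x'^2 + y'^2 + z'^2 - l'^2 &= (\alpha_1 x + \beta_1 l)^2 + (\alpha_2 y + \beta_2 l)^2 + (\alpha_3 z + \beta_3 l)^2 - (\gamma_1 x + \gamma_2 y + \gamma_3 z + \beta_4 l)^2 \\ &= (\alpha_1^2 - \gamma_1^2)x^2 + (\alpha_2^2 - \gamma_2^2)y^2 + (\alpha_3^2 - \gamma_3^2)z^2 + (\beta_1^2 + \beta_2^2 + \beta_3^2 - \beta_4^2)l^2 \\ &\quad + 2xl(\alpha_1\beta_1 - \gamma_1\beta_4) + 2yl(\alpha_2\beta_2 - \gamma_2\beta_4) + 2zl(\alpha_3\beta_3 - \gamma_3\beta_4) \\ &\quad - 2\gamma_1\gamma_2 xy - 2\gamma_2\gamma_3 yz - 2\gamma_3\gamma_1 zx. \end{aligned} \tag{1.79}$$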


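Comparison of equations (1.78) and (1.79) implies the relations ($i = 1, 2, 3$):

$$\alpha_i^2 - \gamma_i^2 = 1 \tag{1.80}$$
$$\gamma_1\gamma_2 = \gamma_2\gamma_3 = \gamma_3\gamma_1 = 0 \tag{1.81}$$
$$\beta_1^2 + \beta_2^2 + \beta_3^2 - \beta_4^2 = -1 \tag{1.82}$$
$$\alpha_i\beta_i - \gamma_i\beta_4 = 0. \tag{1.83}$$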


Relations (1.80), (1.81), (1.82), and (1.83) constitute a system of nine simultaneous
equations which can be used to express nine parameters in terms of one. Indeed
equation (1.81) implies that two of the $\gamma_i$ are equal to zero. Without restricting the
generality we assume $\gamma_1 \neq 0 \Rightarrow \gamma_2 = \gamma_3 = 0$. Then equation (1.80) gives:

$$\alpha_1 = \pm\sqrt{1 + \gamma_1^2} \tag{1.84}$$
$$\alpha_2 = \pm 1, \qquad \alpha_3 = \pm 1. \tag{1.85}$$

Then from equations (1.82) and (1.83) follows:

$$\beta_1 = \frac{\gamma_1}{\alpha_1}\beta_4 \tag{1.86}$$
$$\beta_2 = \beta_3 = 0 \tag{1.87}$$
$$\beta_4 = \pm\sqrt{1 + \gamma_1^2}. \tag{1.88}$$




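We note that all coefficients of the transformation (1.78) have been expressed in
terms of the parameter $\gamma_1$. The general form of the transformation is:

$$\begin{pmatrix} l' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \beta_4 & \gamma_1 & 0 & 0 \\ \frac{\gamma_1}{\alpha_1}\beta_4 & \alpha_1 & 0 & 0 \\ 0 & 0 & \alpha_2 & 0 \\ 0 & 0 & 0 & \alpha_3 \end{pmatrix} \begin{pmatrix} l \\ x \\ y \\ z \end{pmatrix}$$

where the quantities $\alpha_1, \beta_4$ are defined in terms of the parameter $\gamma_1$ via relations
(1.84) and (1.88) respectively. It follows that we have in total $2^5 = 32$ possible
one parameter transformations, $L(\gamma_1)$ say, which are specified by the different
combinations of signs of the components of the transformation. Because the
parameter $\gamma_1$ appears in the combination $1 + \gamma_1^2$ we introduce a new parameter $\gamma$ with the relation:

$$\gamma = \frac{1}{\sqrt{1 - \beta^2}} \qquad (-1 < \beta < 1, \ \gamma \geq 1) \tag{1.89}$$

and have:

$$\gamma_1 = \pm\gamma\beta. \tag{1.90}$$

Then the expression of the various parameters in terms of the new parameter $\gamma$ is:

$$\alpha_1 = \pm\gamma, \quad \alpha_2 = \pm 1, \quad \alpha_3 = \pm 1, \quad \beta_4 = \pm\gamma.$$

The 32 transformations are written in terms of the parameter $\beta$ (or $\gamma$):

$$\begin{aligned} l' &= \varepsilon_1\gamma\beta x + \varepsilon_2\gamma l \\ x' &= \varepsilon_3\gamma x + \varepsilon_1\varepsilon_2\varepsilon_3\beta\gamma l \\ y' &= \varepsilon_4 y \\ z' &= \varepsilon_5 z \end{aligned} \tag{1.91}$$

where the quantities $\varepsilon_i = \pm 1$ $(i = 1, 2, 3, 4, 5)$ are defined by the relations:

$$\alpha_1 = \varepsilon_3\gamma, \quad \alpha_2 = \varepsilon_4, \quad \alpha_3 = \varepsilon_5, \quad \beta_4 = \varepsilon_2\gamma, \quad \gamma_1 = \varepsilon_1\gamma\beta. \tag{1.92}$$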


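(b) The determinant of the matrix $L(\beta)$ equals:

$$\det L(\beta) = \begin{vmatrix} \varepsilon_2\gamma & \varepsilon_1\gamma\beta & 0 & 0 \\ \varepsilon_1\varepsilon_2\varepsilon_3\beta\gamma & \varepsilon_3\gamma & 0 & 0 \\ 0 & 0 & \varepsilon_4 & 0 \\ 0 & 0 & 0 & \varepsilon_5 \end{vmatrix} = \varepsilon_2\varepsilon_3\varepsilon_4\varepsilon_5. \tag{1.93}$$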




The possible values of the determinant are $\pm 1$. The requirement $\det L(\beta) = +1$ is
equivalent to the condition:

$$\varepsilon_2\varepsilon_3\varepsilon_4\varepsilon_5 = 1 \tag{1.94}$$

which gives $2^4 = 16$ cases. Therefore the condition that the determinant equals
+1 or -1 selects 16 cases respectively. We select the first set because it contains
the identity transformation (whose determinant equals +1). The transformations
in this set form a group and their general form is given by the transformation
equations (1.91).


(c) We consider next the transformations which, in addition to the condition $\det L = +1$,
satisfy the condition that when $l' = l = 0$ then $x = x' = 0$, $y = y'$, $z = z'$
and also preserve the orientation of the space axes. The new condition implies
the equations:

$$y = \varepsilon_4 y, \qquad z = \varepsilon_5 z \tag{1.95}$$

which are satisfied by the following values of the parameters:

$$\varepsilon_4 = \varepsilon_5 = 1.$$

The last equation and equation (1.94) imply:

$$\varepsilon_3\varepsilon_2 = 1 \tag{1.96}$$

which implies that $\varepsilon_2 = \varepsilon_3 = \pm 1$. The preservation of the orientation of the space
axes requires $\varepsilon_3\varepsilon_4\varepsilon_5 = 1$, therefore we have eventually $\varepsilon_2 = \varepsilon_3 = 1$.

There remains only one free parameter, $\varepsilon_1$, therefore there are two families
of single-parametric Lorentz transformations with $\det L(\beta) = +1$. The transfor-
mations in these families constitute a group and are called boosts with parameter
$\beta$ ($\varepsilon_1 = -1$) and $-\beta$ ($\varepsilon_1 = 1$) respectively. The transformation equations for
each case are as follows:

$\varepsilon_1 = 1$:

$$l' = \gamma(l + \beta x), \qquad x' = \gamma(x + \beta l), \qquad y' = y, \qquad z' = z \tag{1.97}$$

$\varepsilon_1 = -1$:

$$l' = \gamma(l - \beta x), \qquad x' = \gamma(x - \beta l), \qquad y' = y, \qquad z' = z. \tag{1.98}$$




In the following Table we summarize the results concerning the number of
free parameters and the number of the corresponding transformations for each
requirement.²⁸

Requirements                                          Free parameters   Possible transformations
Conditions (1.78) and (1.79)                          5                 32
$\det L(\beta) = +1$ or $-1$                          4                 16
Orientation of space axes                             3                 8
$l = l' = 0$, $x = x' = 0$, $y = y'$, $z = z'$        1                 2
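The counting in the Table can be reproduced by brute force. The sketch below is an illustration only: it enumerates the $2^5$ sign choices in the general form $l' = \varepsilon_1\gamma\beta x + \varepsilon_2\gamma l$, $x' = \varepsilon_3\gamma x + \varepsilon_1\varepsilon_2\varepsilon_3\beta\gamma l$, $y' = \varepsilon_4 y$, $z' = \varepsilon_5 z$ and applies the requirements one after the other. The function names, the test event and the value of $\beta$ are arbitrary choices made for the example.

```python
import itertools
import math

def transform(eps, beta, event):
    """Apply the general form with signs eps = (e1, ..., e5) to event = (l, x, y, z)."""
    e1, e2, e3, e4, e5 = eps
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    l, x, y, z = event
    return (e1 * g * beta * x + e2 * g * l,
            e3 * g * x + e1 * e2 * e3 * beta * g * l,
            e4 * y,
            e5 * z)

def lorentz_length(event):
    l, x, y, z = event
    return x * x + y * y + z * z - l * l

def determinant(eps):
    """det L = e2 * e3 * e4 * e5."""
    return eps[1] * eps[2] * eps[3] * eps[4]

signs = list(itertools.product((1, -1), repeat=5))
event, beta = (1.0, 2.0, 3.0, 4.0), 0.6

preserving = [e for e in signs
              if abs(lorentz_length(transform(e, beta, event)) - lorentz_length(event)) < 1e-9]
det_plus = [e for e in preserving if determinant(e) == 1]
oriented = [e for e in det_plus if e[2] * e[3] * e[4] == 1]   # e3 * e4 * e5 = 1
boosts = [e for e in oriented if e[3] == 1 and e[4] == 1]     # y = y', z = z'

print(len(preserving), len(det_plus), len(oriented), len(boosts))  # 32 16 8 2
```

The two surviving sign sets differ only in $\varepsilon_1$, which is the statement that a boost along the x-axis comes in the two versions $\beta$ and $-\beta$.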


(d) The geometric interpretation of the transformation we shall give in Sect. 1.8.1.
Concerning the kinematic interpretation (that is, interpretation involving the
time and the space) of the transformation we note the following:

• The transformation is single-parametric, therefore it must be related with one
scalar kinematic quantity only.

• The transformation is symmetric in the sense that if the coordinates
$(l', x', y', z')$ are expressed in terms of $(l, x, y, z)$ with $\beta$ then the $(l, x, y, z)$
are expressed in terms of $(l', x', y', z')$ with $-\beta$ (prove this).

• The transformation must satisfy the initial condition: when $l = l' = 0$ then
$x = x'$, $y = y'$, $z = z'$ (or $\mathbf{r} = \mathbf{r}'$).


The above results lead to the following kinematic interpretation of the transfor-
mation. The coordinates $l, l'$ concern the time and the coordinates $x, x', y, y', z, z'$
the orthonormal (in the Euclidian sense!) spatial axes of two Relativistic Inertial
Observers. At the time moment $l = l' = 0$ the spatial axes of the observers coincide
and subsequently they are moving so that the plane $x$-$z$ is parallel transported with
respect to the plane $x'$-$z'$ (because during the motion $y = y'$) and similarly the
plane $x$-$y$ is parallel transported with respect to the plane $x'$-$y'$ (because during
the motion $z = z'$).

We conclude that there are only two motions possible:

• One motion in which the $x'$ axis slides along the $x$ axis in the direction $x > 0$
• One motion in which the $x'$ axis slides along the $x$ axis in the direction $x < 0$.

We consider that the first type of motion corresponds to the values $1 > \beta > 0$
while the second to the values $0 > \beta > -1$. The parameter $\beta$ we identify with the
quotient $v/c$, where $v$ (with $|v| < c$, because $|\beta| < 1$) is the relative velocity of
the parallel axes $x, y, z$ and $x', y', z'$ and $c$ is a universal constant, which is the
maximum speed with which one frame can be boosted relative to another. In
Physics we identify $c$ with the speed (not velocity!) of light in vacuum (see Fig. 1.4).


²⁸We note that a pure Lorentz transformation (i.e. Euclidian rotations and translations excluded)
depends on three parameters, the components of $\boldsymbol{\beta}$. Here we are left with one parameter due to the simplified form of the
transformation.




Fig. 1.4 Kinematic interpretation of Lorentz transformation 


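Exercise 1.7.1 Prove the identities²⁹:

$$\gamma^2 = \gamma^2\beta^2 + 1 \tag{1.99}$$
$$\gamma\left(\frac{\beta_1 + \beta_2}{1 + \beta_1\beta_2}\right) = \gamma(\beta_1)\gamma(\beta_2)(1 + \beta_1\beta_2) \tag{1.100}$$
$$d\gamma = \gamma^3\beta\, d\beta, \qquad d(\gamma\beta) = \gamma^3 d\beta \tag{1.101}$$
$$\gamma = 1 + \frac{1}{2}\beta^2 + \frac{3}{8}\beta^4 + \ldots \tag{1.102}$$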
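Before proving the identities it can be reassuring to test them numerically. The following sketch is not a proof; it evaluates the residual of each identity at sample values of $\beta$, with the two differential identities approximated by central differences. The function names and the step `h` are choices made for this illustration.

```python
import math

def gamma(beta):
    return 1.0 / math.sqrt(1.0 - beta * beta)

def check_identities(b1, b2, h=1e-6):
    """Absolute residuals of (1.99), (1.100) and the two parts of (1.101)."""
    g = gamma(b1)
    r_199 = abs(g * g - (g * g * b1 * b1 + 1.0))
    r_1100 = abs(gamma((b1 + b2) / (1.0 + b1 * b2))
                 - gamma(b1) * gamma(b2) * (1.0 + b1 * b2))
    # Central differences approximate d(gamma)/d(beta) and d(gamma*beta)/d(beta).
    dgamma = (gamma(b1 + h) - gamma(b1 - h)) / (2.0 * h)
    dgb = ((b1 + h) * gamma(b1 + h) - (b1 - h) * gamma(b1 - h)) / (2.0 * h)
    return r_199, r_1100, abs(dgamma - g ** 3 * b1), abs(dgb - g ** 3)

def gamma_series(beta):
    """First three terms of the expansion (1.102)."""
    return 1.0 + beta ** 2 / 2.0 + 3.0 * beta ** 4 / 8.0

print(check_identities(0.5, 0.3))          # four small residuals
print(gamma(0.1), gamma_series(0.1))       # agree to about six decimals
```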


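Exercise 1.7.2. Consider the matrix:

$$L^{i'}_{\ j}(\beta) = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

where $\gamma = 1/\sqrt{1 - \beta^2}$, $\beta \in [0, 1)$.
(a) Compute the inverse, $L^{i}_{\ j'}(\beta)$.
(b) Prove that $L^{i'}_{\ j}(\beta)$ is a Lorentz transformation.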


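(c) Define the hyperbolic angle $\phi$ with the relation $\cosh\phi = \gamma$ and prove that the
boost along the x-axis is written as:

$$L^{i'}_{\ j}(\phi) = \begin{pmatrix} \cosh\phi & -\sinh\phi & 0 & 0 \\ -\sinh\phi & \cosh\phi & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

where $\sinh\phi = \beta\gamma$. Also show that $\tanh\phi = \beta$ and $e^{\phi} = \sqrt{\frac{1+\beta}{1-\beta}}$. The parameter $\phi$
is called the rapidity of the transformation.
[Hint: (a) The inverse is:

$$L^{i}_{\ j'}(\beta) = \begin{pmatrix} \gamma & \gamma\beta & 0 & 0 \\ \gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

(b) It is enough to show that $\eta_{ij} = \eta_{rs}L^{r}_{\ i}L^{s}_{\ j}$ or, in the form of matrices, $\eta = L^t\eta L$
where $\eta = \mathrm{diag}(-1, 1, 1, 1)$.]

²⁹Identity (1.99) is used extensively in the calculations. Identity (1.100) expresses the relativistic
composition of 3-velocities under successive boosts.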
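The relations between $\beta$, $\gamma$ and the rapidity can be checked with a few lines of Python. This is a small sketch restricted to the $(l, x)$ block of the boost; `rapidity` and `boost_2x2` are names introduced here for illustration.

```python
import math

def rapidity(beta):
    """Hyperbolic angle phi with tanh(phi) = beta."""
    return math.atanh(beta)

def boost_2x2(phi):
    """(l, x) block of the boost along x, written with the rapidity."""
    return [[math.cosh(phi), -math.sinh(phi)],
            [-math.sinh(phi), math.cosh(phi)]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

beta = 0.6
phi = rapidity(beta)
gamma = 1.0 / math.sqrt(1.0 - beta * beta)

print(math.cosh(phi), gamma)                                   # cosh(phi) = gamma
print(math.sinh(phi), beta * gamma)                            # sinh(phi) = beta * gamma
print(math.exp(phi), math.sqrt((1 + beta) / (1 - beta)))       # e^phi = sqrt((1+beta)/(1-beta))

# The inverse boost corresponds to phi -> -phi: the product is the identity up to rounding.
print(matmul(boost_2x2(phi), boost_2x2(-phi)))
```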


Example 1.7.3 Consider the LCF $\{K, x^i, E_i\}$, $\{K', x^{i'}, E_{i'}\}$ which are related with
the transformation:

$$\begin{aligned} x^{1'} &= -\sinh\phi\, x^0 + \cosh\phi\, x^1 \\ x^{2'} &= x^2, \qquad x^{3'} = x^3 \\ x^{0'} &= -\sinh\phi\, x^1 + \cosh\phi\, x^0 \end{aligned}$$

where $\phi$ is a real parameter.

(a) Prove that the transformation $K \to K'$ is a Lorentz transformation.

(b) A four-vector $V^i$ in $K$ has components $(0, 1, 0, 1)^t$. Compute the components
of $V^i$ in $K'$. A Lorentz tensor $T_{ij}$ of type (0, 2) in $K'$ has all its components
equal to zero except $T_{1'1'} = T_{3'3'} = 1$. Compute the components of $T_{ij}$ in
$K$. Compute in $K$ the covariant vector $T_{ij}V^j$ and the invariant $T_{ij}V^iV^j$.


Solution 


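(a) The matrix transformation between $K$, $K'$ is:

$$L^{i'}_{\ j} = \begin{pmatrix} \cosh\phi & -\sinh\phi & 0 & 0 \\ -\sinh\phi & \cosh\phi & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

It is easy to show that this is a Lorentz transformation (see Exercise 1.7.2).
(b) For the four-vector $V^i$ we have:

$$V^{i'} = L^{i'}_{\ j}V^j.$$

Hence:

$$\begin{aligned} V^{0'} &= L^{0'}_{\ j}V^j = L^{0'}_{\ 0}V^0 + L^{0'}_{\ 1}V^1 + L^{0'}_{\ 2}V^2 + L^{0'}_{\ 3}V^3 = -\sinh\phi \\ V^{1'} &= L^{1'}_{\ j}V^j = L^{1'}_{\ 0}V^0 + L^{1'}_{\ 1}V^1 + L^{1'}_{\ 2}V^2 + L^{1'}_{\ 3}V^3 = \cosh\phi \\ V^{2'} &= L^{2'}_{\ j}V^j = L^{2'}_{\ 0}V^0 + L^{2'}_{\ 1}V^1 + L^{2'}_{\ 2}V^2 + L^{2'}_{\ 3}V^3 = 0 \\ V^{3'} &= L^{3'}_{\ j}V^j = L^{3'}_{\ 0}V^0 + L^{3'}_{\ 1}V^1 + L^{3'}_{\ 2}V^2 + L^{3'}_{\ 3}V^3 = 1. \end{aligned}$$

Therefore $[V^{i'}] = (-\sinh\phi, \cosh\phi, 0, 1)^t$.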


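Similarly for the tensor $T_{ij}$ we have:

$$T_{ij} = \frac{\partial x^{i'}}{\partial x^{i}}\frac{\partial x^{j'}}{\partial x^{j}}T_{i'j'} = L^{i'}_{\ i}L^{j'}_{\ j}T_{i'j'} = L^{1'}_{\ i}L^{1'}_{\ j}T_{1'1'} + L^{3'}_{\ i}L^{3'}_{\ j}T_{3'3'} = L^{1'}_{\ i}L^{1'}_{\ j} + L^{3'}_{\ i}L^{3'}_{\ j}$$

where we have used the fact that in $K'$ only the components $T_{1'1'}$, $T_{3'3'}$ do not vanish
and are equal to 1. From this relation we compute the components of $T_{ij}$ in $K$ using
the standard method. For example for the $T_{00}$ component we have:

$$T_{00} = L^{1'}_{\ 0}L^{1'}_{\ 0} + L^{3'}_{\ 0}L^{3'}_{\ 0} = \sinh^2\phi.$$

In form of a matrix the components of $T_{ij}$ in $K$ are:

$$[T_{ij}]_K = \begin{pmatrix} \sinh^2\phi & -\sinh\phi\cosh\phi & 0 & 0 \\ -\sinh\phi\cosh\phi & \cosh^2\phi & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$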


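In $K$ we have for the vector $a_i = T_{ij}V^j$:

$$\begin{aligned} a_0 &= T_{00}V^0 + T_{01}V^1 + T_{02}V^2 + T_{03}V^3 = -\cosh\phi\sinh\phi \\ a_1 &= T_{10}V^0 + T_{11}V^1 + T_{12}V^2 + T_{13}V^3 = \cosh^2\phi \\ a_2 &= T_{20}V^0 + T_{21}V^1 + T_{22}V^2 + T_{23}V^3 = 0 \\ a_3 &= T_{30}V^0 + T_{31}V^1 + T_{32}V^2 + T_{33}V^3 = 1. \end{aligned}$$

Finally for the invariant $a_iV^i = T_{ij}V^iV^j$ we have (in all LCF!):

$$a_iV^i = a_0V^0 + a_1V^1 + a_2V^2 + a_3V^3 = \cosh^2\phi + 1.$$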
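The results of this example can be cross-checked with matrix products. The sketch below is an illustration in plain Python, assuming the matrix rules $[V]_{K'} = [L][V]_K$ and $[T]_K = [L]^t[T]_{K'}[L]$ for the boost matrix with entries $\cosh\phi$, $-\sinh\phi$ of the example; the function name `example_173` and the value of $\phi$ are arbitrary.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def example_173(phi):
    c, s = math.cosh(phi), math.sinh(phi)
    L = [[c, -s, 0.0, 0.0], [-s, c, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
    V_K = [[0.0], [1.0], [0.0], [1.0]]                      # column vector in K
    T_Kp = [[0.0] * 4 for _ in range(4)]                    # tensor components in K'
    T_Kp[1][1] = T_Kp[3][3] = 1.0
    V_Kp = matmul(L, V_K)                                   # components of V in K'
    T_K = matmul(matmul(transpose(L), T_Kp), L)             # components of T in K
    a_K = matmul(T_K, V_K)                                  # a_i = T_ij V^j in K
    invariant = matmul(transpose(a_K), V_K)[0][0]           # a_i V^i
    return V_Kp, T_K, a_K, invariant

phi = 0.7
V_Kp, T_K, a_K, invariant = example_173(phi)
print([row[0] for row in V_Kp])
print(T_K[0][:2], T_K[1][:2])
print([row[0] for row in a_K], invariant, math.cosh(phi) ** 2 + 1.0)
```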




[Figure: two panels showing $1/\gamma$ and $\gamma$ plotted against $\beta$ for $0 \leq \beta \leq 1$]

Fig. 1.5 The factors $\gamma$ and $1/\gamma$


Exercise 1.7.3 Derive the results of Example 1.7.3 using matrix multiplication as
follows:

$$[V]_{K'} = [L][V]_K, \qquad [T]_K = [L]^t[T]_{K'}[L], \qquad [a]_K = [T]_K[V]_K, \qquad a_iV^i = [a]^t_K[V]_K.$$


Example 1.7.4 Compute the values of the functions $\gamma = (1 - \beta^2)^{-1/2}$ and $\gamma^{-1}$ for
the values $\beta$ = 0.100, 0.300, 0.600, 0.800, 0.900, 0.950, 0.990. Plot the results.


Solution

β      0.100   0.300   0.600   0.800   0.900   0.950   0.990
1/γ    0.995   0.954   0.800   0.600   0.436   0.312   0.141
γ      1.005   1.048   1.250   1.667   2.294   3.203   7.089

Using the figures of the Table we draw the curves of Fig. 1.5 from which we note
that the relativistic effects become significant for values $\beta > 0.8$. This is the reason
we expect the relativistic effects to appear at high relative velocities.
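The entries of the Table can be regenerated with a short script. This is a minimal sketch in plain Python; the function name `gamma_table` and the three-decimal rounding are choices made here to match the layout of the Table.

```python
import math

def gamma_table(betas):
    """Rows (beta, 1/gamma, gamma), the last two rounded to three decimals."""
    rows = []
    for b in betas:
        inv_gamma = math.sqrt(1.0 - b * b)
        rows.append((b, round(inv_gamma, 3), round(1.0 / inv_gamma, 3)))
    return rows

table = gamma_table([0.100, 0.300, 0.600, 0.800, 0.900, 0.950, 0.990])
for b, inv_g, g in table:
    print(f"{b:5.3f}  {inv_g:5.3f}  {g:5.3f}")
```

Plotting the two columns against $\beta$ (with any plotting tool) gives curves of the kind shown in Fig. 1.5.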


1.8 The Kinematic Interpretation of the General Lorentz 
Transformation 


1.8.1 Relativistic Parallelism of Space Axes 


In Sect. 1.7 we have shown that the general Lorentz transformation $L(\boldsymbol{\beta}, E)$ can be
written as the product of two transformations:

$$L(\boldsymbol{\beta}, E) = L(\boldsymbol{\beta})R(E) \tag{1.103}$$




The transformation $L(\boldsymbol{\beta})$ depends on three parameters $\boldsymbol{\beta} = (\beta_1, \beta_2, \beta_3)$, which may
be considered as the components of a vector in a linear space $R^3$. The transformation
$R(E)$ is a EOT and it is also defined uniquely in terms of three parameters, e.g.
the Euler angles. Each part of the general Lorentz transformation has a different
kinematical meaning.

In a LCF the transformation $R(E)$ is (see equation (1.61)):

$$R(E) = \begin{pmatrix} 1 & 0 \\ 0 & E \end{pmatrix} \tag{1.104}$$

therefore it affects the spatial part of the four-vectors only. This leads to the
following geometric and kinematic interpretation of the transformation $R(E)$. The
transformation $R(E)$:


1. Distinguishes the components of a four-vector $A^i$ in two groups: The temporal
or zeroth component, which is not affected by the action of $R(E)$, and the
spatial part, which is transformed as a 3-vector under the action of the Euclidian
transformation $E$. This grouping of the components of a four-vector is very
important and it is used extensively in the study of Relativity Physics (Special
and General). It is known as the 1 + 3 decomposition and it can be extended to
apply to any tensor (see Sect. 12.2).

2. Geometrically describes the relative orientation of the spatial axes of the LCF
it relates. That is if $K(E_1, E_2, E_3)$ and $K'(E'_1, E'_2, E'_3)$ are two LCF related by
the general Lorentz transformation $L(\boldsymbol{\beta}, E)$, then the matrix $E$ relates the spatial
bases of $K$, $K'$ according to the equation:

$$(E'_1, E'_2, E'_3) = (E_1, E_2, E_3)E. \tag{1.105}$$


This interpretation of the transformation $R(E)$ is Euclidian in the sense that it
refers to the relative orientation of the axes of the frames in Euclidian space $E^3$.
This means that we may distinguish the action of the general Lorentz transformation
in two parts: A Euclidian part expressed by the action of the transformation $R(E)$
and a relativistic part expressed by $L(\boldsymbol{\beta})$.

Therefore without restricting the generality we can get rid of the transformation
$R(E)$ by simply considering the general Lorentz transformation for $E = I_3$. Then
one is left with the relativistic part of the transformation. Now $E = I_3$ means that
the axes of the frames $K$, $K'$ have the same orientation, that is, they are parallel.
This parallelism is understood in the Euclidian sense and it is not Lorentz invariant.
Therefore it has to be defined in a covariant manner in order to attain objectivity in
the world of $M^4$. This leads us to the following definition:


Definition 1.8.1 Let $K$, $K'$ be two LCF which are related by the general Lorentz
transformation $L(\boldsymbol{\beta}, E) = L(\boldsymbol{\beta})R(E)$. We say that the spatial axes of $K$, $K'$ are
relativistically parallel if, and only if, $R(E) = I$.




This concept of parallelism is directly comparable to the Euclidian concept of
parallelism in $E^3$ because the latter can be defined as follows:


Definition 1.8.2 Let $\Sigma$, $\Sigma'$ be two ECF which are related with the EOT $E$. We say
that the axes of $\Sigma$, $\Sigma'$ are parallel if, and only if, $E = I_3$ where $I_3 = \mathrm{diag}(1, 1, 1)$
is the identity matrix.


The fact that the Euclidian and the relativistic parallelism of space axes are
closely related is frequently misunderstood and has led to erroneous conclusions.
On the other hand it has important applications as for example the Thomas effect.
The difference between the two types of parallelism is that they coincide for two
LCF $K$, $K'$ but not necessarily for more. That is if $K$, $K'$, $K''$ are three LCF such
that the space axes of the pairs $K$, $K'$ and $K'$, $K''$ are parallel in the Euclidian sense
then it does not follow (in general) that the space axes of the LCF $K$, $K''$ are parallel
in the Euclidian sense. Of course this is not the case in the Newtonian kinematics.

The “broken parallelism” is due to the fact that the second
general Lorentz transformation $L(\boldsymbol{\beta}_{K'})$ acts on the Euclidian part $R(E_K)$ of the
first Lorentz transformation so that the latter becomes $\neq I_4$, which “breaks” the
parallelism of the space axes of $K$, $K''$.
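The broken parallelism can be seen numerically by composing two pure boosts in perpendicular directions. The sketch below is an illustration under stated assumptions: it uses the matrix form of the pure boost with spatial block $I + \frac{\gamma-1}{\beta^2}\boldsymbol{\beta}\boldsymbol{\beta}^t$, and the helper `split`, introduced only for this example, factors the product as $L(\boldsymbol{\beta})R(E)$ and returns the Euclidian block $E$.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def boost(beta):
    """Pure boost L(beta) for beta = (bx, by, bz), coordinates (l, x, y, z)."""
    b2 = sum(b * b for b in beta)
    gamma = 1.0 / math.sqrt(1.0 - b2)
    m = [[0.0] * 4 for _ in range(4)]
    m[0][0] = gamma
    for i in range(3):
        m[0][i + 1] = m[i + 1][0] = -gamma * beta[i]
        for j in range(3):
            extra = (gamma - 1.0) * beta[i] * beta[j] / b2 if b2 > 0 else 0.0
            m[i + 1][j + 1] = (1.0 if i == j else 0.0) + extra
    return m

def split(lam):
    """Write lam = L(beta) R(E): return beta and the 3x3 Euclidian block E."""
    beta = tuple(-lam[i + 1][0] / lam[0][0] for i in range(3))
    rot = matmul(boost(tuple(-b for b in beta)), lam)      # L(beta)^(-1) = L(-beta)
    return beta, [row[1:] for row in rot[1:]]

# K -> K' is a boost along x, K' -> K'' a boost along y, each with parallel axes.
lam = matmul(boost((0.0, 0.6, 0.0)), boost((0.6, 0.0, 0.0)))
beta, E = split(lam)
print(beta)                                    # beta-factor of K'' relative to K
for row in E:
    print(row)                                 # a rotation about z, not the identity
print(math.degrees(math.acos(E[0][0])))        # rotation angle in degrees
```

Each boost alone is a symmetric matrix, while the product is not; the non-trivial block $E$ is the rotation that spoils the parallelism of the axes of $K$ and $K''$.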

The interpretation of the transformation $R(E)$ in terms of the relative direction
of the space axes is possible because we assume isotropy of 3-space, therefore all
directions in space are equivalent. From the point of view of Physics this means 
that the orientation of space axes in the geometric space does not affect the physical 
properties of physical systems. Such conditions are known as gauge conditions and 
play an important role in theoretical Physics. 


1.8.2. The Kinematic Interpretation of Lorentz Transformation 


In order to give a kinematic interpretation of the (pure) Lorentz transformation $L(\boldsymbol{\beta})$
we use the fact that this transformation depends on the vector $\boldsymbol{\beta}$ only and not on the
matrix $E$. Therefore we identify:

$$L(\boldsymbol{\beta}) = L(\boldsymbol{\beta}, I_3)$$

that is, we consider that the Lorentz transformation is the general Lorentz transfor-
mation when the axes of $\Sigma$ and $\Sigma'$ are relativistically parallel. The vector $\boldsymbol{\beta}$ we
identify with the relative velocity of $\Sigma$ and $\Sigma'$. This kinematic interpretation is
shown in Fig. 1.6.

With the above interpretation, the general Lorentz transformation in $M^4$ is
decomposed in two general Lorentz transformations: one transformation $L(\boldsymbol{\beta}, I_3)$,
in which the space axes of the related LCF are parallel and their relative velocity is $\boldsymbol{\beta}$,
and a second transformation $L(\mathbf{0}, E)$, which rotates the axes in the Euclidian space
$E^3$ about the (fixed) origin in order to make them parallel. The relativistic part of the
transformation is what we call the (pure) Lorentz transformation. This interpretation
implies that the relativistic effects show up only when we have relative motion!

Fig. 1.6 The kinematic interpretation of Lorentz transformation

Geometrically it is possible to view the Lorentz transformation as a “hyperbolic
rotation” around the space direction $\boldsymbol{\beta}$ with a hyperbolic angle $\psi$, defined by the
relation:

$$\cosh\psi = \gamma. \tag{1.106}$$

The hyperbolic angle $\psi$ is called the rapidity.


1.9 The Geometry of the Boost 


As we have shown in Sect. 1.7 the Lorentz transformation $L(\boldsymbol{\beta})$³⁰ can be expressed
as the product of a boost and two Euclidian rotations. These rotations concern the
direction of the relative velocity in the (parallel) axes of the two LCF related by the
Lorentz transformation. This decomposition is helpful in practice because we solve
a specific problem for a boost and then we use the Euclidian rotations to get the
(usually more complicated) answer. Therefore it is important to study the geometric
structure of the boost.

We recall that if $(l, x, y, z)$ and $(l', x', y', z')$ are the coordinates of a point $P$ in
$M^4$ in the LCF $\Sigma$ and $\Sigma'$ respectively then the boost $L(\beta)$ (to be precise we should
write $L(\beta)_x$) along the direction of the x-axis is the transformation:

$$x' = \gamma(x - \beta l), \qquad y' = y, \qquad z' = z, \qquad l' = \gamma(l - \beta x). \tag{1.107}$$


³⁰Not the general Lorentz transformation $L(\boldsymbol{\beta}, E)$!




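Let us represent this transformation on the Euclidian plane³¹ $(l, x)$; the $y, z$
coordinates are not affected by the transformation and we ignore them. The boost
$L(\beta)$ and its inverse $L^{-1}(\beta)$ are represented with the following matrices:

$$L(\beta) = \begin{pmatrix} \gamma & -\beta\gamma \\ -\beta\gamma & \gamma \end{pmatrix}, \qquad L^{-1}(\beta) = \begin{pmatrix} \gamma & \beta\gamma \\ \beta\gamma & \gamma \end{pmatrix}. \tag{1.108}$$

Because $L(\beta)$ is a linear transformation it is enough to study its action on the
basis vectors $e_{0,\Sigma} = (1, 0)_\Sigma$, $e_{1,\Sigma} = (0, 1)_\Sigma$ of the LCF $\Sigma$. We have:

$$(e_{0,\Sigma'}, e_{1,\Sigma'}) = (e_{0,\Sigma}, e_{1,\Sigma})L^{-1} = (e_{0,\Sigma}, e_{1,\Sigma})\begin{pmatrix} \gamma & \beta\gamma \\ \beta\gamma & \gamma \end{pmatrix} \Rightarrow$$

$$e_{0,\Sigma'} = \gamma e_{0,\Sigma} + \beta\gamma e_{1,\Sigma}, \qquad e_{1,\Sigma'} = \beta\gamma e_{0,\Sigma} + \gamma e_{1,\Sigma}. \tag{1.109}$$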

Concerning the Euclidian length and the Euclidian angles of the basis vectors
we have:

$$|e_{0,\Sigma}| = |e_{1,\Sigma}| = 1$$

$$|e_{0,\Sigma'}| = |e_{1,\Sigma'}| = \gamma\sqrt{1 + \beta^2} = \sqrt{\frac{1 + \beta^2}{1 - \beta^2}} \tag{1.110}$$

$$e_{0,\Sigma}\cdot e_{1,\Sigma} = 0, \qquad e_{0,\Sigma'}\cdot e_{1,\Sigma'} = 2\beta\gamma^2$$

$$e_{0,\Sigma}\cdot e_{0,\Sigma'} = e_{1,\Sigma}\cdot e_{1,\Sigma'} = \gamma.$$

These results lead to the following representation of the basis vectors in the
Euclidian plane $(l, x)$ (see Fig. 1.7).
In Fig. 1.7 the Euclidian angle $\phi$ is defined by the relation:

$$\tan\phi = \beta. \tag{1.111}$$

We note that the vectors $e_{0,\Sigma'} = (1, 0)_{\Sigma'}$, $e_{1,\Sigma'} = (0, 1)_{\Sigma'}$ make equal angles
with the basis vectors $e_{0,\Sigma} = (1, 0)_\Sigma$, $e_{1,\Sigma} = (0, 1)_\Sigma$ and furthermore they are
symmetric about the internal bisector. When $\beta = 1$, $\tan\phi = 1$, hence $\phi = 45°$ and
the vectors $e_{0,\Sigma'} = (1, 0)_{\Sigma'}$, $e_{1,\Sigma'} = (0, 1)_{\Sigma'}$ coincide with the internal bisector
$l = x$ of the axes $(l, x)$.
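The Euclidian lengths and angles of the boosted basis vectors can be verified numerically. The following is a small sketch in plain Python, taking the components $e_{0,\Sigma'} = (\gamma, \beta\gamma)$ and $e_{1,\Sigma'} = (\beta\gamma, \gamma)$ along the axes $(l, x)$; the function names are introduced here for illustration.

```python
import math

def boosted_basis(beta):
    """Components of e_{0,S'} and e_{1,S'} along the axes (l, x) of S."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return (g, beta * g), (beta * g, g)

def dot(u, v):
    """Euclidian inner product in the plane (l, x)."""
    return u[0] * v[0] + u[1] * v[1]

def norm(u):
    return math.sqrt(dot(u, u))

beta = 0.6
g = 1.0 / math.sqrt(1.0 - beta * beta)
e0p, e1p = boosted_basis(beta)

print(norm(e0p), norm(e1p), math.sqrt((1 + beta ** 2) / (1 - beta ** 2)))   # Euclidian lengths
print(dot(e0p, e1p), 2 * beta * g * g)                                      # not orthogonal
angle = math.acos(dot(e0p, (1.0, 0.0)) / norm(e0p))                         # angle with the l axis
print(math.tan(angle), beta)                                                # tan(phi) = beta
```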


³¹It is generally believed that this plane is a two dimensional Minkowski space, that is, a two
dimensional linear space endowed with the Lorentz metric. This is wrong and leads to many
misunderstandings in Special Relativity. It should always be borne in mind that the blackboard
is and stays a 2-d Euclidian space no matter what we draw on it!




Fig. 1.7 The action of boost on the basis vectors


If the LCF $\Sigma$ and $\Sigma''$ are moving with factor $-\beta$ then the vectors $e_{0,\Sigma''} = (1, 0)_{\Sigma''}$,
$e_{1,\Sigma''} = (0, 1)_{\Sigma''}$ make a common external angle $\phi$ with the vectors
$e_{0,\Sigma} = (1, 0)_\Sigma$, $e_{1,\Sigma} = (0, 1)_\Sigma$ (see Fig. 1.7).


Exercise 1.9.1 Prove that the Lorentz length of the basis vectors $e_{0,\Sigma}$, $e_{1,\Sigma}$, $e_{0,\Sigma'}$,
$e_{1,\Sigma'}$ equals 1. We define the Lorentz angle $\phi_L$ between two basis vectors, e.g. the
vectors $e_{1,\Sigma}$, $e_{1,\Sigma'}$, by the formula

$$\cosh\phi_L = \frac{e_{1,\Sigma}\circ e_{1,\Sigma'}}{|e_{1,\Sigma}|_L\,|e_{1,\Sigma'}|_L} \tag{1.112}$$

where $\circ$ indicates Lorentz product in the plane $(l, x)$. Show that the vectors of each
basis are Lorentz orthogonal and that the Lorentz angle between e.g. the vectors
$e_{1,\Sigma}$, $e_{1,\Sigma'}$ is given by $\cosh\phi_L = \gamma$, therefore $\phi_L = \psi$ where $\psi$ is the parameter
given in (1.35) satisfying the relation $\cosh\psi = \gamma$.


Note that in (1.112) we are using cosh not cos because $\gamma > 1$.


Exercise 1.9.2 Show that the Euclidian lengths satisfy the relation:

$$|e_{i,\Sigma'}| = \frac{1}{\sqrt{\cos 2\phi}}\,|e_{i,\Sigma}|, \qquad i = 0, 1 \tag{1.113}$$

where $\phi$ is the Euclidian angle³² defined in (1.111) and conclude that:

$$|e_{i,\Sigma'}| > |e_{i,\Sigma}|. \tag{1.114}$$
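The statements of the last two exercises can be checked numerically before they are proved. The sketch below assumes the Lorentz product $u \circ v = -u^0v^0 + u^1v^1$ in the plane $(l, x)$ and the components $e_{0,\Sigma'} = (\gamma, \beta\gamma)$, $e_{1,\Sigma'} = (\beta\gamma, \gamma)$; the function names are hypothetical.

```python
import math

def lorentz_dot(u, v):
    """Lorentz product in the plane (l, x) with metric diag(-1, 1)."""
    return -u[0] * v[0] + u[1] * v[1]

def euclid_norm(u):
    return math.sqrt(u[0] * u[0] + u[1] * u[1])

def stretch_factor(beta):
    """1 / sqrt(cos(2 phi)) with tan(phi) = beta, the factor relating the Euclidian lengths."""
    phi = math.atan(beta)
    return 1.0 / math.sqrt(math.cos(2.0 * phi))

beta = 0.6
g = 1.0 / math.sqrt(1.0 - beta * beta)
e0, e1 = (1.0, 0.0), (0.0, 1.0)
e0p, e1p = (g, beta * g), (beta * g, g)

print(lorentz_dot(e0p, e0p), lorentz_dot(e1p, e1p))   # -1 and +1: Lorentz length 1
print(lorentz_dot(e0p, e1p))                          # 0: Lorentz orthogonal
print(lorentz_dot(e1, e1p), g)                        # cosh(phi_L) = gamma
print(euclid_norm(e1p), stretch_factor(beta))         # Euclidian length exceeds 1
```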


³²Note that $\phi \in (-\frac{\pi}{4}, \frac{\pi}{4})$ therefore $2\phi \in (-\frac{\pi}{2}, \frac{\pi}{2})$ and $\cos 2\phi > 0$.




The geometric meaning of the inequality (1.114) is that the Euclidian length of
the unit rod along the $x'$ axis is larger than the Euclidian length of the unit rod along
the $x$ axis. Therefore an object lying along the $x$ axis (e.g. a rod) when it is measured
with the unit of $\Sigma$ gives the number $d(\Sigma)$ and when it is measured with the unit of
$\Sigma'$ gives another number $d'(\Sigma')$ smaller than $d(\Sigma)$ because:

$$d(\Sigma)|e_{i,\Sigma}| = d'(\Sigma')|e_{i,\Sigma'}|. \tag{1.115}$$

The physical meaning of the inequality $d'(\Sigma') < d(\Sigma)$ is that the (Euclidian!)
length of the rod as measured in $\Sigma'$ (that is the number $d'(\Sigma')$) is smaller than
the (Euclidian!) length $d(\Sigma)$ as measured in $\Sigma$. This inequality of (Euclidian!)
length measurements has been called length contraction. Concerning the unit
along the timelike vectors $e_{0,\Sigma}$, $e_{0,\Sigma'}$ we identify $d(\Sigma)$, $d'(\Sigma')$ with the inverse of
time durations respectively. Then the inequality $d'(\Sigma') < d(\Sigma)$ means that the
(Newtonian!) time duration of a phenomenon in $\Sigma$ is smaller than the (Newtonian!)
duration of the same phenomenon in $\Sigma'$. This result has been called time dilation
effect. Both length contraction and time dilation will be discussed in Chap. 5.

A simple way to draw the vectors $e_{0,\Sigma'}$ and $e_{1,\Sigma'}$ on the Euclidian plane $E^2$ is the
following. We consider a Lorentz unit vector (in $E^2$!) with components $\binom{x}{y}$ and
demand:

$$(x, y)\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \pm 1$$

from which follows:

$$-x^2 + y^2 = \pm 1.$$

We infer that the tip of the Lorentz unit vectors moves on hyperbolae with
asymptotes $x = \pm y$ (see Fig. 1.8).

If we consider orthonormal coordinates $(l, x)$ in the Euclidian plane then these
hyperbolae are symmetric about the bisectors at the origin. For each value of the
parameter $\beta$ the Lorentz unit vectors make an angle $\phi$ with the $l$, $x$ axes. In order
to compute this angle in terms of the parameter $\beta$ we consider the Euclidian inner
product of the vectors $e_{0,\Sigma'}$, $e_{1,\Sigma'}$ with the basis vectors $\mathbf{i}$, $\mathbf{j}$. For the vector $e_{1,\Sigma'}$ we
have:

$$e_{1,\Sigma'}\cdot\mathbf{i} = \cos\phi\,|e_{1,\Sigma'}|\,|\mathbf{i}|$$

from which follows:

$$\cos\phi = \frac{1}{\sqrt{1 + \beta^2}}. \tag{1.116}$$




Fig. 1.8 The Lorentz transformation in the Euclidian plane 




Fig. 1.9 Combination of 6 factors under successive boosts 


With this result we can draw in the Euclidian plane $(l, x)$ the vectors $e_{0,\Sigma'}$ and $e_{1,\Sigma'}$
for every value of the parameter $\beta$.


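Example 1.9.1 Consider three LCF $\Sigma$, $\Sigma'$, $\Sigma''$, which are moving in the standard
configuration with factors $\beta_1$, $\beta_2$ respectively along the common axis $x$, $x'$, $x''$. Use
the geometric representation of the boost to compute the factor $\beta_3$ between the LCF
$\Sigma$, $\Sigma''$ (see also (1.100)).


Proof The angles which add under successive boosts are the Lorentz (hyperbolic) angles
of Exercise 1.9.1, for which $\tanh\phi_i = \beta_i$. From Fig. 1.9 we have $\tanh\phi_3 = \tanh(\phi_1 + \phi_2)$. Hence:

$$\beta_3 = \frac{\beta_1 + \beta_2}{1 + \beta_1\beta_2}. \tag{1.117}$$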
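The composition rule for collinear boosts can be compared numerically with the addition of hyperbolic angles. This is a minimal sketch; both function names are introduced here for illustration.

```python
import math

def compose(beta1, beta2):
    """Relativistic composition of two collinear beta factors."""
    return (beta1 + beta2) / (1.0 + beta1 * beta2)

def compose_via_rapidity(beta1, beta2):
    """Same quantity obtained by adding the hyperbolic angles atanh(beta)."""
    return math.tanh(math.atanh(beta1) + math.atanh(beta2))

print(compose(0.6, 0.7), compose_via_rapidity(0.6, 0.7))
print(compose(0.9, 0.9))      # stays below 1 although 0.9 + 0.9 > 1
```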




Another simple method to represent the boost in the Euclidian plane $(l, x)$ is the
following. We consider two orthogonal axes $l$, $x$ and consider the boost (1.107),
which defines the new axes $l'$, $x'$ on the same plane. The $l'$ axis is defined by the
requirement:

$$x' = 0 \Rightarrow x = \beta l$$

therefore it is a straight line with an inclination $\beta$ wrt the axis $l$. Similarly the $x'$ axis
is defined by the requirement $l' = 0$, therefore it is the set of points:

$$l = \beta x$$

which is a straight line with inclination $\beta$ wrt the axis $x$.

In Fig. 1.7 we have taken the axes $l$, $x$ to be orthogonal but this is not necessary
and what we say holds for non-orthogonal axes. We have also drawn the axes $x''$, $l''$
with inclination $-\beta$. □


1.10 Characteristic Frames of Four-Vectors


We have divided the four-vectors in $M^4$ in timelike, spacelike and null according
to whether their (Lorentz) length is $< 0$, $> 0$ or $= 0$ respectively. In this section we
show that the first two types of four-vectors admit characteristic LCF in which they
attain their reduced form, that is the timelike vectors have zero spatial components
and the spacelike vectors have zero time component. As we shall see in the
subsequent chapters the reduced form of the timelike and the spacelike vectors is
used extensively in the study of Lorentz geometry and in the Theory of Special
Relativity.


1.10.1 Proper Frame of a Timelike Four-Vector


Consider the timelike four-vector $A^i$ which in the LCF $\Sigma$ has decomposition
$\binom{A^0}{\mathbf{A}}_\Sigma$. When the zeroth component $A^0 > 0$ (respectively $A^0 < 0$) we say that the
four-vector $A^i$ is directed along the future (respectively past) light cone. Because
the proper Lorentz transformation preserves the sign of the zeroth component it is
possible to divide the timelike four-vectors in future directed and past directed.

For every timelike four-vector $A^i$ we have $\mathbf{A}^2 - (A^0)^2 < 0$ therefore there always
exists a unique LCF, $\Sigma^+$ say, in which the spatial components of $A^i$ vanish and
the four-vector takes its reduced or canonical form $\binom{(A^0)^+}{\mathbf{0}}_{\Sigma^+}$. The frame $\Sigma^+$ we
name the proper frame of the four-vector $A^i$.

In order to determine $\Sigma^+$ when $A^i$ is given in an arbitrary LCF $\Sigma$, we must
find the $\beta$-factor $\boldsymbol{\beta}$ of $\Sigma^+$ with respect to $\Sigma$. This is done from the proper Lorentz
transformation (1.51) and (1.52) using the reduced form of $A^i$. We find:

$$A^0 = \gamma(A^0)^+ \tag{1.118}$$
$$\mathbf{A} = \gamma\boldsymbol{\beta}(A^0)^+. \tag{1.119}$$

From these relations follows:

$$\gamma = \frac{A^0}{(A^0)^+} \tag{1.120}$$
$$\boldsymbol{\beta} = \frac{\mathbf{A}}{A^0}. \tag{1.121}$$

Equations (1.120) and (1.121) fix the proper frame $\Sigma^+$ of the four-vector $A^i$
by giving the three parameters $\beta^\mu$ wrt an arbitrary LCF $\Sigma$ in which we know the
components of the four-vector $A^i$.
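The determination of the proper frame can be illustrated numerically. The sketch below is an example under stated assumptions: the vector is future directed ($A^0 > 0$), $\boldsymbol{\beta} = \mathbf{A}/A^0$, and the check uses the vector form of the proper Lorentz transformation given in Sect. 1.7. The function names and the sample vector are chosen only for illustration.

```python
import math

def proper_lorentz(beta, l, r):
    """Vector form of the proper Lorentz transformation: returns (l', r')."""
    b2 = sum(b * b for b in beta)
    if b2 == 0.0:
        return l, tuple(r)
    gamma = 1.0 / math.sqrt(1.0 - b2)
    b_dot_r = sum(b * x for b, x in zip(beta, r))
    coeff = (gamma - 1.0) / b2 * b_dot_r - gamma * l
    return gamma * (l - b_dot_r), tuple(x + coeff * b for x, b in zip(r, beta))

def proper_frame(a0, a):
    """Return (beta, gamma, a0_plus) for a future directed timelike vector (a0, a)."""
    a0_plus = math.sqrt(a0 * a0 - sum(x * x for x in a))   # invariant, a0 > 0 assumed
    beta = tuple(x / a0 for x in a)                        # beta = A / A^0
    return beta, a0 / a0_plus, a0_plus                     # gamma = A^0 / (A^0)^+

a0, a = 5.0, (1.0, 2.0, 3.0)
beta, gamma, a0_plus = proper_frame(a0, a)
print(beta, gamma, a0_plus)
print(proper_lorentz(beta, a0, a))   # time part ~ a0_plus, spatial part ~ (0, 0, 0)
```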

In the proper frame $\Sigma^+$ of $A^i$ the length $A^iA_i = -[(A^0)^+]^2$ therefore the
component $(A^0)^+$ is not simply a component but an invariant (tensor), so that it
has the same value in all LCF. This fact we use extensively in relativistic Physics in
order to define timelike four-vectors which have a definite physical meaning. Indeed
we define the timelike four-vector in its proper frame and then we compute it in any
other LCF using the appropriate Lorentz transformation. One such example is the
4-velocity of a relativistic particle which, as we shall see, is a four-vector with
constant length $c$, where $c$ is the speed of light in empty space. Because in Special
Relativity we consider $c$ to be an invariant (in fact a universal constant) we define
the four velocity of the relativistic particle in its proper frame to be $u^i = \binom{c}{\mathbf{0}}_{\Sigma^+}$.
In any other frame with axes parallel to those of $\Sigma^+$ and with $\beta$-factor $\boldsymbol{\beta}$, relations
(1.51) and (1.52) give that the 4-velocity is $u^i = \binom{\gamma c}{\gamma c\boldsymbol{\beta}}_\Sigma$.


1.10.2 Characteristic Frame of a Spacelike Four-Vector


Let $A^i$ be a spacelike four-vector which in a LCF $\Sigma$ has decomposition $\binom{A^0}{\mathbf{A}}_\Sigma$. We
are looking for another LCF $\Sigma^-$ in which $A^i$ has the reduced form $\binom{0}{\mathbf{A}^-}_{\Sigma^-}$.



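The two decompositions of the four-vector are related with a proper Lorentz
transformation, therefore the transformation equations (1.51) and (1.52) give:

$$A^0 = \gamma\boldsymbol{\beta}\cdot\mathbf{A}^- \tag{1.122}$$

$$\mathbf{A} = \mathbf{A}^- + \frac{\gamma - 1}{\beta^2}(\boldsymbol{\beta}\cdot\mathbf{A}^-)\boldsymbol{\beta}. \tag{1.123}$$

We consider the Euclidian inner product of the second equation with $\boldsymbol{\beta}$ and get:

$$\beta = \frac{A^0}{A_\parallel}$$

where $A_\parallel = \frac{\boldsymbol{\beta}\cdot\mathbf{A}}{\beta}$. It follows that it is not possible to define $\boldsymbol{\beta}$ completely as in the
case of the timelike four-vectors. Therefore there are infinitely many LCF in which
a spacelike four-vector has its reduced form.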
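The non-uniqueness of the characteristic frame can be seen in a small numerical experiment. The sketch below is an illustration only: it takes a sample spacelike vector and several different $\boldsymbol{\beta}$, all satisfying $\boldsymbol{\beta}\cdot\mathbf{A} = A^0$, and applies the vector form of the proper Lorentz transformation from Sect. 1.7. The names and numbers are arbitrary.

```python
import math

def proper_lorentz(beta, l, r):
    """Vector form of the proper Lorentz transformation: returns (l', r')."""
    b2 = sum(b * b for b in beta)
    if b2 == 0.0:
        return l, tuple(r)
    gamma = 1.0 / math.sqrt(1.0 - b2)
    b_dot_r = sum(b * x for b, x in zip(beta, r))
    coeff = (gamma - 1.0) / b2 * b_dot_r - gamma * l
    return gamma * (l - b_dot_r), tuple(x + coeff * b for x, b in zip(r, beta))

a0, a = 1.0, (2.0, 0.0, 0.0)              # spacelike: a.a - a0^2 = 3 > 0
candidates = [(0.5, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.0, -0.7)]   # each has beta.a = a0

results = [proper_lorentz(beta, a0, a) for beta in candidates]
for beta, (l1, r1) in zip(candidates, results):
    # zero time component in every frame; the spatial length is always sqrt(3)
    print(beta, l1, math.sqrt(sum(x * x for x in r1)))
```

Every admissible $\boldsymbol{\beta}$ removes the time component, which is why an additional condition is needed to single out one frame.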

However in the set of all these frames there is a unique LCF defined as follows.
We consider the position four-vectors of the end points $A$, $B$ of the spacelike four-
vector $AB^i$ and assume that in a characteristic frame $\Sigma^-$ they are decomposed as
$\binom{A^{0-}}{A^-\hat{\mathbf{e}}_A}_{\Sigma^-}$, $\binom{B^{0-}}{B^-\hat{\mathbf{e}}_B}_{\Sigma^-}$. Then we specify a unique characteristic frame by the
condition $A^-\hat{\mathbf{e}}_A = -B^-\hat{\mathbf{e}}_B$. This particular characteristic frame of the spacelike
four-vector $AB^i$ we call the rest frame of $AB^i$. The rest frame is used to describe
the motion of rigid rods in Special Relativity.


1.11 Particle Four-Vectors


The timelike and the null four-vectors play an important role in Special (and 
General) Relativity, because they are associated with physical quantities of particles 
and photons respectively. Since in many problems the study of particles and photons 
is identical it is useful to introduce a new class of four-vectors, the particle four- 
vectors. 


Definition 1.11.1 A four-vector is a particle four-vector if and only if: 


• It is a timelike or a null four-vector
• The zeroth component in a LCF is positive (i.e. it is future directed).


For particle four-vectors we have the following result which is a consequence of 
Proposition 1.6.1. 


Proposition 1.11.1 The sum of particle four-vectors is a particle four-vector. 


This result indicates that geometry allows us to describe systems of particles
with corresponding systems of particle four-vectors and, furthermore, to study the
reaction of these particles by studying the sum of the corresponding particle four-
vectors. We consider two cases: The case of parallel propagation of a beam of
particles and the triangle inequality in $M^4$.


Proposition 1.11.2 Jf two future directed null vectors are parallel (antiparallel), 
then their spatial directions are parallel (antiparallel) for all observers. Equiva- 
lently the property of 3-parallelism (3-antiparallelism) of null vectors is a covariant 
property. 

Proof Consider two future directed null vectors which in the LCF $\Sigma$ have
components³³

$$A^i_{(I)} = E_{(I)} \begin{pmatrix} 1 \\ \mathbf{e}_{(I)} \end{pmatrix}_{\Sigma}, \quad E_{(I)} > 0, \quad I = 1, 2.$$


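We have:

$$A^i_{(1)} \parallel A^i_{(2)} \Rightarrow A^i_{(1)} = \lambda A^i_{(2)} \quad (\lambda \in \mathbb{R})$$

where $\lambda = +1$ for parallel and $\lambda = -1$ for antiparallel. Considering the components
of the four-vectors in $\Sigma$ we have:

$$E_{(1)} = \lambda E_{(2)}, \quad \mathbf{e}_{(1)} = \lambda \mathbf{e}_{(2)}.$$

□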
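The covariance stated in Proposition 1.11.2 can be checked numerically for the parallel case. The following is a minimal sketch, not part of the original proof: it assumes units with c = 1, restricts the change of observer to a boost along the x-axis, and uses illustrative helper names (`null_vector`, `boost_x`, `spatial_direction`, `stay_parallel`) and sample values chosen only for this example.

```python
import math

def null_vector(energy, direction):
    """Return E * (1, e) as a tuple (t, x, y, z), with e scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in direction))
    return (energy,) + tuple(energy * c / norm for c in direction)

def boost_x(vec, beta):
    """Boost along the x-axis with speed beta (units with c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    t, x, y, z = vec
    return (gamma * (t - beta * x), gamma * (x - beta * t), y, z)

def spatial_direction(vec):
    """Unit 3-vector along the spatial part of a four-vector."""
    norm = math.sqrt(sum(c * c for c in vec[1:]))
    return tuple(c / norm for c in vec[1:])

def stay_parallel(a1, a2, beta, tol=1e-12):
    """True if the boosted spatial directions of a1 and a2 coincide."""
    d1 = spatial_direction(boost_x(a1, beta))
    d2 = spatial_direction(boost_x(a2, beta))
    return all(abs(p - q) < tol for p, q in zip(d1, d2))

# Two future directed null vectors with the same spatial direction in Sigma
# (sample values, assumed for illustration only).
e = (1.0, 2.0, 2.0)
a1 = null_vector(3.0, e)
a2 = null_vector(7.0, e)

# Check several boosted observers.
results = [stay_parallel(a1, a2, beta) for beta in (0.0, 0.3, -0.8)]
print(results)
```

The check works because the boost is linear: a proportionality between the two four-vectors in one LCF is carried over to every other LCF.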


Proposition 1.11.3 Let O, A and B be three points in $M^4$ such that the four-vectors
$OA^a$ and $AB^a$ are future directed timelike four-vectors. Then the four-vector $OB^a$
is a future directed four-vector and the absolute values of the Lorentz lengths
(positive numbers) of the three four-vectors satisfy the relation:

$$(OB) \geq (OA) + (AB) \qquad (1.124)$$

where the equality holds if, and only if, the three points O, A, B are on a straight
line.


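³³ A second proof is by means of Theorem 17.4.1. Indeed from that theorem we have
$\left[\sum_{I} A^i_{(I)}\right]^2 = 0$, hence:

$$(E_1 + E_2, \; \mathbf{e}_1 E_1 + \mathbf{e}_2 E_2)^2 = 0 \Rightarrow$$

$$-(E_1 + E_2)^2 + (\mathbf{e}_1 E_1 + \mathbf{e}_2 E_2)^2 = 0 \Rightarrow$$

$$\mathbf{e}_1 \cdot \mathbf{e}_2 = 1 \Rightarrow \mathbf{e}_1 \parallel \mathbf{e}_2 \Rightarrow \mathbf{e}_1 E_1 \parallel \mathbf{e}_2 E_2.$$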




Proof The linearity of $M^4$ implies:

$$OB^a = OA^a + AB^a.$$

From Proposition 1.11.1 we conclude that $OB^a$ is a timelike four-vector and future
directed. The length:

$$-(OB)^2 = -(OA)^2 - (AB)^2 + 2\,OA^a AB_a. \qquad (1.125)$$

In the proper frame $\Sigma^+$ of $OA^a$ we have $OA^a = \begin{pmatrix} OA^+ \\ \mathbf{0} \end{pmatrix}_{\Sigma^+}$ with $OA^+ = (OA) > 0$,
and suppose that $AB^a = \begin{pmatrix} AB^0 \\ \mathbf{AB} \end{pmatrix}_{\Sigma^+}$, where $AB^0 > 0$ because $AB^a$
is future directed. The invariant:

$$OA^a AB_a = -OA^+ AB^0.$$

But $-(AB)^2 = -(AB^0)^2 + \mathbf{AB}^2 < 0$ because $AB^a$ is timelike, hence
$AB^0 \geq (AB) > 0 \Rightarrow OA^+ AB^0 \geq OA^+ (AB)$. Finally:

$$-OA^a AB_a \geq (OA)(AB)$$

where the equality holds only if $\mathbf{AB} = \mathbf{0}$, that is the four-vectors $OA^a$ and $AB^a$
are parallel, hence the three points O, A, B lie on the same straight line (in $M^4$).
Replacing in (1.125) we find:

$$(OB)^2 = (OA)^2 + (AB)^2 - 2\,OA^a AB_a \geq ((OA) + (AB))^2 \Rightarrow (OB) \geq (OA) + (AB)$$

which completes the proof. □


The result of Proposition 1.11.3 can be generalized for a polygon consisting
of (n − 1) future directed timelike four-vectors $OA_1A_2 \ldots A_n$. In this case the
inequality reads:

$$(OA_n) \geq (OA_1) + (A_1A_2) + \ldots + (A_{n-1}A_n). \qquad (1.126)$$

We note that in $M^4$ the triangle inequality (1.124) is opposite to the corresponding
inequality of the Euclidian Geometry. As we shall see this geometric result is
the reason for the mass loss in relativistic reactions.
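A short numerical illustration of the reversed triangle inequality (1.124) may help. This is only a sketch under stated assumptions: units with c = 1, signature (−, +, +, +), and sample four-vectors chosen for this example. The function names `lorentz_length` and `add` are illustrative.

```python
import math

def lorentz_length(vec):
    """Positive Lorentz length of a timelike four-vector (t, x, y, z)."""
    t, x, y, z = vec
    return math.sqrt(t * t - x * x - y * y - z * z)

def add(u, v):
    """Componentwise sum of two four-vectors."""
    return tuple(a + b for a, b in zip(u, v))

# Two future directed timelike four-vectors that are not parallel.
oa = (5.0, 3.0, 0.0, 0.0)
ab = (5.0, -3.0, 0.0, 0.0)
ob = add(oa, ab)

# Strict inequality: (OB) exceeds (OA) + (AB).
lhs = lorentz_length(ob)
rhs = lorentz_length(oa) + lorentz_length(ab)
print(lhs, rhs)

# Collinear case: AB is a positive multiple of OA, so equality holds.
ab_collinear = (10.0, 6.0, 0.0, 0.0)
ob_collinear = add(oa, ab_collinear)
lhs_collinear = lorentz_length(ob_collinear)
rhs_collinear = lorentz_length(oa) + lorentz_length(ab_collinear)
print(lhs_collinear, rhs_collinear)
```

In the first case the length of the sum is larger than the sum of the lengths, which is the opposite of the Euclidian situation. In the collinear case the two sides agree.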


1.12 The Center System (CS) of a System of Particle Four-Vectors


Let $A^a_{(I)}$, $I = 1, \ldots, n$ be a finite set of future directed particle vectors not all
null and parallel. According to Theorem 17.4.1 their sum $A^a = \sum_{I=1}^{n} A^a_{(I)}$ is a
timelike four-vector. The proper frame of the vector $A^a$ we denote with $\Sigma^+$ and call
the Center System³⁴ (CS) of the set of the four-vectors $A^a_{(I)}$, $I = 1, \ldots, n$. In
the CS $A^a$ has components:

$$A^a = \begin{pmatrix} A^{0+} \\ \mathbf{0} \end{pmatrix}_{\Sigma^+}$$

where $A^{0+} = \sqrt{-A^a A_a}$ is an invariant.


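Exercise 1.12.1 Assume that in their CS $\Sigma^+$ the four-vectors $A^a_{(I)}$, $I = 1, \ldots, n$
have components $A^a_{(I)} = \begin{pmatrix} A^{0+}_{(I)} \\ \mathbf{A}^+_{(I)} \end{pmatrix}_{\Sigma^+}$. Show that:

$$\sum_{I=1}^{n} A^{0+}_{(I)} = A^{0+}, \qquad \sum_{I=1}^{n} \mathbf{A}^+_{(I)} = \mathbf{0}$$

where $A^{0+}$ is the zero component of the sum $A^a$ in $\Sigma^+$.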


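Exercise 1.12.2 Prove that the $\gamma$ and $\boldsymbol{\beta}$ factors of the CS in $\Sigma$ are given by the
relations:

$$\gamma = \frac{\sum_{I=1}^{n} A^0_{(I)}}{A^{0+}} \qquad (1.127)$$

$$\boldsymbol{\beta} = \frac{\sum_{I=1}^{n} \mathbf{A}_{(I)}}{\sum_{I=1}^{n} A^0_{(I)}} \qquad (1.128)$$

Verify that the above quantities satisfy the relation $\gamma^2 = \frac{1}{1 - \beta^2}$.
Hint: Recall that if a timelike four-vector in a LCF $\Sigma$ has components
$A^a = \begin{pmatrix} A^0 \\ \mathbf{A} \end{pmatrix}_{\Sigma}$ then the $\gamma$ and $\boldsymbol{\beta}$ factors of the proper frame of $A^a$ in $\Sigma$ are given by
the relations:

$$\gamma = \frac{A^0}{A^{0+}}, \qquad \boldsymbol{\beta} = \frac{\mathbf{A}}{A^0}. \qquad (1.129)$$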
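As a numerical companion to Exercise 1.12.2 (not a substitute for the proof), the relations (1.127) and (1.128) can be evaluated for a small sample system. The sketch below assumes units with c = 1 and signature (−, +, +, +). The function name `center_system_factors` and the two sample particle four-vectors are hypothetical choices made for this example.

```python
import math

def center_system_factors(vectors):
    """Return (gamma, beta) of the CS for particle four-vectors (t, x, y, z).

    gamma follows (1.127) and beta follows (1.128); the sum is assumed timelike.
    """
    total = [sum(v[k] for v in vectors) for k in range(4)]
    a0 = total[0]                      # zeroth component of the sum in Sigma
    spatial = total[1:]                # spatial part of the sum in Sigma
    a0_plus = math.sqrt(a0 * a0 - sum(c * c for c in spatial))  # sqrt(-A^a A_a)
    gamma = a0 / a0_plus
    beta = tuple(c / a0 for c in spatial)
    return gamma, beta

# Sample system: one timelike and one null future directed four-vector.
particles = [
    (5.0, 3.0, 0.0, 0.0),   # timelike
    (2.0, 0.0, 2.0, 0.0),   # null
]
gamma, beta = center_system_factors(particles)
beta_squared = sum(b * b for b in beta)

# Consistency check: gamma^2 should equal 1 / (1 - beta^2).
print(gamma ** 2, 1.0 / (1.0 - beta_squared))
```

For this sample the sum is (7, 3, 2, 0), so $A^{0+} = 6$ and both printed numbers should agree up to rounding.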


³⁴ In case all four-vectors are null and parallel then $A^a$ is also null and parallel to these four-vectors
and in that case we do not define a CS.


Chapter 2 
The Structure of the Theories of Physics


2.1 Introduction 


The mechanistic point of view that the ultimate scope of Physics is to “explain” 
the whole of the physical world and the numerous phenomena in it, is no longer 
widely acceptable and hides behind it cosmogonic and theocratic beliefs. It is 
rather safer to say that nowadays we believe that Physics describes the physical 
phenomena by means of “pictures”, which are constructed according to strictly 
defined procedures. What a “picture” of a physical phenomenon is and how it is 
constructed is a very serious philosophical subject. Equally serious is the assessment 
of when these “pictures” are to be considered successful or “real”. Obviously the 
present book cannot address these difficult questions in depth or in extent. However 
an “answer” to these questions must be given if we are going to develop Special 
Relativity (in fact any theory of Physics) on a firm conceptual basis and avoid the 
many misunderstandings which have accompanied this theory for long periods in its 
history. 

Consequently, avoiding difficult questions and obscure discussions, we demand 
that a physical phenomenon shall be described by a set of physical quantities which 
measure/characterize the organization of the physical systems participating in the 
phenomenon. These physical quantities are the elements comprising the “picture” 
of the physical phenomenon. Furthermore we agree that the “picture” of a physical
phenomenon will be successful if it predicts/explains/describes, to within a
“reasonable” accuracy and idealistic approximations, the physical phenomenon as
the result of an organization of some relevant physical systems. For example, if we
have a simple pendulum of length $l$ in a gravitational field $g$, then the prediction
that the period T of the pendulum, under specified experimental conditions and 


© Springer Nature Switzerland AG 2019 57 
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-27347-7_2 




idealistic approximations, is given by the formula $T = 2\pi\sqrt{l/g}$ can be verified within
some acceptable limits of accuracy. The above leads to the following questions:
a. How do we “specify/describe” a physical quantity? 
b. When shall the descriptions of a physical quantity be equal or “equivalent”? Are the 
“pictures” of a given physical quantity always the same ones? 


c. How do we differentiate or “measure” the differences between two descriptions of the 
same physical quantity? 


All these questions and many more of the same kind have been and will be 
posed throughout the course of human history. There are no definite and unique
answers to these questions, and this is the reason we have made extensive use
of quotation marks. The answers given to such questions can only be beliefs and
“reasonable” explanations which, in turn, are based on other more fundamental 
beliefs and explanations and so on. This infinite sequence of reasoning is the realm 
of philosophy. However in science, and more specifically in Physics, we cannot 
afford the luxury of an endless series of questions, beliefs and explanations. This is 
because Physics is an applied science, which acquires meaning and substance in the 
laboratory and in everyday practice. For example there is no room for beliefs and 
explanations concerning the take-off of an airplane, or the safety of a nuclear reactor. 
In conclusion, the practicalities of life itself impose upon us a definite “real” world
of objective physical phenomena, which science is called upon to consider and for
which it must offer systematically formed views and propositions corresponding to its purposes.
This “imposed” reality is the realm of science and this is the “world” we want 
to “explain” in terms of the human concept of reality, created by our senses and 
perception. 

In the following sections we develop one approach/explanation as to how the 
methodology of Physics as a science, is systematically formed. Obviously other 
writers have a different approach and the reader may have his/her own. However no 
matter which approach one takes the common agreement is that: 


a. There is no single “correct” approach, but some approaches are more “successful” than
others in certain subsets of phenomena

b. Whichever approach is adopted, the final numeric answer to a given physical phe- 
nomenon or set of phenomena must be within the accuracy of the experimental 
measurements or observations, otherwise the approach is not acceptable. 


2.2 The Role of Physics 


Our interaction with the environment is always by means of some kind of sensors. 
These sensors can be either the sensors of the human body (direct action) or man- 
made sensors such as lenses, clocks, meters etc (indirect action). The first type of 
sensors we shall call basic sensors and the second type measuring sensors. Physics 
is the unique science which is concerned with the development, use and study of the
measuring sensors. This gives Physics a distinct position among all other sciences
in the search for an “objective” reality.

The basic function of the measuring sensors is the observation of a physical 
phenomenon by its association with one or more characteristic quantities. The result 
of an observation is (as a rule) a set of numbers. Therefore observation is the 
procedure with which a physical quantity is described in the mathematical world of 
geometry. We note the following facts about observers, observations and physical 
quantities: 


• The result of an observation (that is, the associated numbers) depends on the
  specific observer performing the observation
• The nature of the physical quantity is independent of the observer observing it


We are therefore led to demand that the sets of numbers associated by the 
various observers to a given physical quantity shall be related by means of some 
transformations. Mathematically this means that the sets of numbers of each 
observer are the components of a geometric object for that observer. Equivalently 
we can say that the mathematical description of the physical quantities will be 
done in terms of geometric objects in a proper geometric space. We emphasize that 
the result of an observation is not an element of the objective world (reality), but 
an element of the world of geometry. We call the description of a phenomenon 
by means of a set of geometric objects the image of the observed physical 
phenomenon. 

According to this view the sensors we use for the observation of the real world 
create another world of images. Science studies this world of images and tries to 
discover its internal structure, if any, or the internal structure of a subset of images 
of a given class. The mechanistic point of view, which prevailed at the end of the 
nineteenth century, can be understood as the belief that there exists one and only 
one internal structure of the world of images and the scope of science is to discover 
this structure. Today, as we remarked at the beginning of this section, this position 
is considered as being too strong. It is our belief that there exist internal structures 
in subsets of certain types of images, that do not appear in the set of all images of 
the objective world, i.e. different structures for different subsets.¹

In the game of science Physics enters with a special role. The fundamental 
requirement of Physics is that whichever internal structure relates a subset of 
images, it must be “objective”, or to put it in another way, non-personalized. Due 
to this requirement, in Physics one uses (in general) images which are created by 
measuring sensors only. 


¹ In our opinion the question of the existence of a universal internal structure involving all images
of the objective world is equivalent to the question of whether there is an omnipresent super power or not!




Furthermore, it is required that Physics be concerned only with images which 
have a definite qualitative and quantitative character. This means that the images 
studied in Physics will be “geometric”, that is they will be described by means 
of concrete mathematical quantities defined on mathematical sets with definite 
mathematical structure (linear spaces, manifolds etc.). These spaces constitute 
the realm of physics. The type and the quantity of the geometric objects which 
describe the images of a theory, geometrize these images and define the “reality” 
of that theory of Physics. For example, in Newtonian Physics the physical quantity 
“position” is associated to the geometric element position vector in the Euclidian 
three dimensional space. 

Indeed physics has shown that the set of images created by the measuring sensors 
is divided into multiple subsets, with each subset having its own internal structure. 
This has resulted in many theories of Physics, each theory having a different “real- 
ity”. For example Newtonian Physics has one reality (that of the Newtonian physical 
phenomena geometrized by Euclidian tensors in three dimensional Euclidian space), 
while Special Relativity has another (the special relativistic phenomena geometrized 
by Lorentz tensors defined on a four dimensional flat metric space endowed with a 
Lorentzian metric). The unification of these different realities as parts of a super or 
universal reality has raised the problem of the unification of the theories of Physics 
in a Unified Theory, a problem which has occupied many distinguished physicists 
over the years and which remains open. 

Each theory of Physics is as valid as any other within the subset of images
to which it applies. Therefore the statement “this theory of Physics is not valid” 
makes no sense. For example Newtonian Physics holds in the subset of Newtonian 
phenomena only and does not hold in the subset of relativistic phenomena. Similarly 
Special Relativity holds in the subset of special relativistic phenomena only. 

The above analysis makes clear that every theory of Physics is intimately and 
uniquely related to the geometrization of the subset of images of the world it studies. 
This geometrization has two branches: 


a. The correspondence of every image in the subset with a geometric object of 
specific type 

b. The description of the internal structure of the subset of images by means of 
geometric relations among the geometric objects of the theory (these include — 
among others — the laws of that theory of Physics). 


Figure 2.1 describes the above by means of a diagram. We note that in Fig. 2.1 
every subset of images is defined by a different window of measuring sensors. In 
the following sections we discuss the general structure of a theory of Physics in 
practical terms. 




Fig. 2.1 The structure of a theory of physics. Diagram labels: GEOMETRY, PHYSICAL THEORY.
Legend:
SPQ = Set of physical quantities of a theory of Physics
W = Window of measuring sensors
S1 = Set of images of measuring sensors
S2 = Set of images of measuring sensors which can be geometrized
S3 = Set of images of a theory of Physics


2.3 The Structure of a Theory of Physics 


As we mentioned in the last section the images studied in Physics are created by 
means of the measuring sensors and are described mathematically by geometric 
objects. The images of physical phenomena are generated by the observers by 
observation. 

The observer in a theory of Physics is the “window” of Fig. 2.1. Its sole role is to 
generate geometric images of the various physical phenomena, which are observed. 
In practical terms we may think of the observer as a machine or robot (ME or I does 
not exist in Physics!) which is equipped with the following: 


• A set of specific measuring sensors, which we call observation means or
  observation instruments
• A definite set of instructions concerning the use of these sensors.


Depending on the type of the observation instruments and the directions of 
use the observers create geometrized images of the physical phenomena. Every 
theory of Physics has its own observers or, equivalently there are as many types 
of observers as viable theories of Physics. For each class of observers there 
corresponds a “reality”, which is the world made up by the subset of images involved 
in the theory. 




In a sense the observers operate as generalized functions called “functors”²
from the set of all physical phenomena to the space of images $\mathbb{R}^n$ (for a properly
defined n). The value of n is fixed by the characteristic quantity of the theory. For
the theories of Physics studying motion the space of images is called spacetime and
the characteristic quantity is the position vector.

For example in Newtonian Physics the characteristic quantity of the theory
is motion, which is described by the image of a path (or orbit) in a linear
space. Practical experience has shown that the path or orbit is fully described with
three numbers at each time moment, therefore the spacetime is a four (n = 4) 
dimensional linear space (endowed with an extra structure, which we shall consider 
in the following). There is no point to consider a spacetime of higher dimension in 
Newtonian Physics, because the extra coordinates will always be zero, and therefore 
redundant. In general the dimension of the space of a theory of Physics equals the 
minimum number of components required to describe completely the characteristic 
quantity of the theory. We call this number the dimension of the space of the theory. 

We summarize the above discussion as follows: 


• The observer of a theory of Physics is a machine (robot) which is equipped with
  specific instrumentation and directions of use, so that it can produce components
  for the characteristic quantities of the physical phenomena.

• Observation or measurement is the operation of the observer which has as a
  result the production of components for the characteristic quantities of a specific
  physical phenomenon or physical system.

• There does not exist a unique type of observers nor a unique type of observation.
  Every class of observers produces a set of images of the outside world, which is
  specific to that class of observers and type of observation. For example, we have
  the sequences: Newtonian Observers → Newtonian Physics → Newtonian world,
  Relativistic Observers → Relativistic Physics → Relativistic world.


2.4 Physical Quantities and Reality of a Theory of Physics 


From the previous considerations it becomes clear that the image of a physical 
phenomenon depends on the observer (i.e. the theory) describing it. However, 
the fundamental principle of Physics is that the image of a single observer for a 
physical phenomenon has no objective value. The “reality” of one has no place in 
today’s science and specifically in Physics, where all observers are considered to be 
(within each theory!) equivalent and similar in all aspects (clones of the same robot). 
The objectivity of the description of physical phenomena within a given theory of 
Physics is achieved by means of the following methodology: 


² A functor in the branch of Mathematics called Category Theory is a mapping of objects in one
category to objects in a second category that preserves relationships between objects. Obviously
this type of mathematics is outside the scope of this book.




Fig. 2.2 Generating diagram of principles of relativity. Diagram labels: Characteristic quantity
of physical phenomenon; Observation of observer 1; Observation of observer 2; Geometric picture
of observer 1; Geometric picture of observer 2; Procedure of existence of physical quantity.


1. We consider the infinity of the specific type of observers (i.e. identical robots
   equipped with the same instrumentation and the same programming) used by the
   theory.

2. We define a specific and unique code of communication and transformation
   amongst these observers, so that each observer is able to communicate the image
   of any physical quantity to any other observer in that same class of observers.

3. We define a procedure which we agree will “prove” that a given physical
   phenomenon is described successfully by a specific theory of Physics.


Following the above approach, we define the “successfulness” of the image of a
phenomenon within a (any) theory of Physics by means of the procedure described
in the general scheme of Fig. 2.2.

In Fig. 2.2 there are two major actions:


1. The observation of the physical phenomenon by an observer with the measuring
   means and the procedures defined by the specific theory of Physics. The result
   of each observation is the creation of geometric objects (in general elements of
   $\mathbb{R}^n$) for the characteristic quantities of the observed physical phenomenon.

2. The verification of the “objectivity” of the concerned physical phenomenon by
   the comparison of the geometric images³ created by procedure 1. This second
   activity involves the transfer of the image of a phenomenon observed by observer
   1 to observer 2 and vice versa, according to the internal code of communication
   and transformation of the observers specified by the specific theory of Physics.
   More specifically, observer 2 compares the image of the phenomenon created
   by direct observation (direct image) with the communicated image of observer
   1 (communicated image). If these two images do not coincide (within the
   specified limits of observation and possible idealizations) then the physical
   phenomenon is not a physical quantity for that specific theory of Physics.
   If they do, then observer 2 communicates his direct image to observer 1, who
   accordingly repeats the same procedure. If the direct image of observer 1
   agrees with the transformed image of observer 2 then the concerned physical


³ By image we do not mean raw data but instead transformed or processed data.




phenomenon is potentially a physical quantity for that theory of Physics. If the 
coincidence of the images holds for any pair of observers of the specific theory 
then this physical phenomenon is a physical quantity for that theory of Physics. 
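The two-way comparison of direct and communicated images can be sketched in code for a toy case. Everything below is an assumption made for illustration and is not taken from the text: the observers are two Newtonian observers in the plane whose Cartesian axes differ by a rotation, the "code of communication" is that rotation, and the names `rotate`, `coincide` and `passes_for_pair` are hypothetical. The direct images are simulated rather than measured.

```python
import math

def rotate(components, angle):
    """Components of a plane vector in axes rotated by `angle` (radians)."""
    x, y = components
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * y, -s * x + c * y)

def coincide(img_a, img_b, tol=1e-9):
    """Two images agree if all components match within the stated tolerance."""
    return all(abs(a - b) <= tol for a, b in zip(img_a, img_b))

def passes_for_pair(direct_1, direct_2, angle_1_to_2):
    """Two-way comparison of direct and communicated images for one pair."""
    # Observer 2 compares its direct image with the image communicated by 1.
    if not coincide(direct_2, rotate(direct_1, angle_1_to_2)):
        return False
    # Observer 1 repeats the comparison with the image communicated by 2.
    return coincide(direct_1, rotate(direct_2, -angle_1_to_2))

# Simulated setup: one displacement, two observers with rotated axes.
displacement = (3.0, 4.0)
phi_1, phi_2 = 0.4, 2.4

# Candidate A: each observer records the components of the displacement.
vec_1 = rotate(displacement, phi_1)
vec_2 = rotate(displacement, phi_2)

# Candidate B: each observer records only the absolute values of the components.
abs_1 = tuple(abs(c) for c in vec_1)
abs_2 = tuple(abs(c) for c in vec_2)

vector_ok = passes_for_pair(vec_1, vec_2, phi_2 - phi_1)
abs_ok = passes_for_pair(abs_1, abs_2, phi_2 - phi_1)
print(vector_ok, abs_ok)
```

In this toy case the displacement components pass the comparison for the chosen pair, while the absolute values of the components do not, so only the former remains a candidate physical quantity. A full test would repeat the comparison for every pair of observers, as the text requires.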


The set of all physical quantities of a theory comprises the reality of this theory 
of Physics. We note that reality for Physics is relative to the theory used to “explain” 
the physical phenomena. That is, the reality of Newtonian Physics is different 
from the reality of the Theory of Special Relativity, in the sense, that the physical 
quantities of Newtonian Physics are not physical quantities for Special Relativity 
and vice versa. For example the physical quantities time and mass of Newtonian 
Physics do not exist in the Theory of Special Relativity in the sense that in the
former they are invariants whereas they are not so in the latter. For the same reason, the
speed of light and the four momentum of Special Relativity do not exist (i.e. are not 
physical quantities) in Newtonian Physics. 

It must be understood that the reality of the physical world we live in is beyond 
and outside the geometric reality of the theories of Physics. On the other hand the 
reality of the world of images of a theory of Physics is a property which can be 
established with concrete and well-defined procedures. This makes the theories of 
Physics the most faithful creations of human intelligence for the understanding and 
the manipulation of the physical environment. The fact that these creations indeed 
work is an amazing fact which indicates that the basic human sensors and the human 
“operational system” (that is the brain) function in agreement with the laws of 
nature. This is the prime reason which led to the initial identification of the
observer-robot with the actual human observer.

The procedure with which one assesses the existence of a physical quantity in 
the realm of the world of a theory of Physics we call the Principle of Relativity of 
that theory. The Principle of Relativity involves transmission (i.e. communication) 
of geometric images among two observers of a theory of Physics, therefore it is 
of a pure geometric nature and can be described mathematically in appropriate 
ways. It must be clear that it is not possible to have a theory of Physics without 
a Principle of Relativity, because that will be a theory without any possibility of 
being tested against the real phenomena. In the sections to follow we shall show 
how these general considerations apply in the cases of Newtonian Physics and the 
Theory of Special Relativity. 

From the above considerations it is wrong to conclude that the identification of a
physical quantity of a theory of Physics is simply a comparison of images (direct
and communicated image) based on the Principle of Relativity of the theory. Indeed, 
the identification of a physical quantity is a twofold activity: 


a. A measuring procedure (the observation, direct image) with which the geometric 
image of the quantity is created; this involves one observer and one physical 
phenomenon 

b. A communication, transformation, and comparison of images, which involves 
two observers and a specified physical phenomenon. 


2.5 Inertial Observers 


The Principle of Relativity of a theory of Physics concerns the communication 
(exchange of information — images) between the observers of the theory. Geometri- 
cally this is expressed by groups of transformations in the geometric space of images 
of the theory. But how does one define a Principle of Relativity in practice? 

In the present book we shall deal only with the cases of Newtonian theory and the 
Theory of Special Relativity. Therefore in the following when we refer to a theory 
of Physics we shall mean one of these two theories only. 

As we have said in the previous sections, the observers of a theory of Physics are 
machines equipped with specific instrumentation and directions of use and produce 
for each physical quantity a set of numbers, which we consider to be the components 
of a geometric object. Two questions arise: 


• In which coordinate system are these components measured?
• What kind of geometric objects will be associated with this set of components or,
  equivalently, which is the group of transformations associated with that theory?


Both these questions must be answered before the theory is ready to be used in 
practice. 

The first question involves essentially the geometrization of the observers, that
is, their correspondence with geometric objects in $\mathbb{R}^n$. These objects we consider to
be the coordinate systems. In a linear space there are infinitely many coordinate
systems. One special class of coordinate systems are the Cartesian coordinate
systems, which are defined by the requirement that their coordinate lines are
mutually perpendicular straight lines, that is, they are described by equations of the form:

$$\mathbf{r} = \mathbf{a}s + \mathbf{b}$$

where $\mathbf{a}$, $\mathbf{b}$ are constant vectors in $\mathbb{R}^n$ and $s \in \mathbb{R}$.

If the Cartesian coordinate systems are going to have a physical meaning in a 
theory of Physics (this is not necessary, but it is the case in Newtonian Physics and 
in the Theory of Special Relativity) the observers who are associated with these 
systems must have the ability to define the necessary number (3 and 4 respectively) 
of straight lines⁴ in the real world by means of a physical process. The observers
who have this ability we call inertial observers. 

In a theory of Physics the existence of these observers is guaranteed with an 
axiom. In Newtonian Physics this axiom is Newton’s First Law and in the Theory 
of Special Relativity there exists a similar axiom (to be referred to later on). At this 


⁴ A straight line in a linear metric space can be defined in two different ways. Either as the curve to
which the tangent at every point lies in the curve (this type of lines are called autoparallels) or as
the curve with extreme length (maximum or minimum) between any of its points (these curves are
called geodesics). These two types of curves need not coincide, however they do so in Newtonian
Physics and Special Relativity.




point it becomes clear that the inertial observers of Newtonian Physics are different 
from those of Special Relativity. For this reason we shall name the first Newtonian 
Inertial Observers (NIO) and the latter Relativistic Inertial Observers (RIO). 


2.6 Geometrization of the Principle of Relativity 


The Principle of Relativity is formulated geometrically by means of two new 
principles: The Principle of Inertia and the Principle of Covariance. 

The first defines the inertial observers of the theory and the second the geometric 
nature (that is the type of the mathematical objects or the transformation group) of 
the theory. Let us examine these two principles in some detail. 


2.6.1 Principle of Inertia 


From the point of view of Physics the concept “inertial observer” concerns the
observer-robot as a physical system and not as a Cartesian coordinate system.
There are no inertial coordinate systems, but only inertial observers who are related
to the Cartesian systems by means of a specified procedure. An inertial observer
can use any coordinate system to report his/her measurements (observations). For
example, an inertial observer in $\mathbb{R}^2$ can change from the Cartesian coordinates (x, y)
to new coordinates (u, v) defined by the transformation u = xy, v = x − y and
subsequently express all his/her measurements in the new coordinate system (u, v).
Obviously the coordinate system (u, v) is not Cartesian. However this is not a
problem because it is possible for the observer to transfer the data to a Cartesian
coordinate system by means of the inverse transformation. (Think of the similar
situation where a novel written in one language is translated either to other dialects
of the same language or to other languages altogether. The novel as a story in all
these languages is the same.) The inertiality of an observer is a property which
can be verified and tested experimentally. The Principle of Inertia specifies the
experimental conditions which decide whether or not an observer is inertial. The
Principle of Inertia is different in each theory of Physics.
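The change of coordinates u = xy, v = x − y mentioned above, together with its inverse, can be written out explicitly. The following is a small sketch with illustrative function names (`to_uv`, `from_uv`). One point it makes visible is that the points (x, y) and (−y, −x) have the same (u, v), so in this sketch the inverse is taken on a chosen branch (for example x + y ≥ 0); the `branch` parameter is an assumption of the example, not something stated in the text.

```python
import math

def to_uv(x, y):
    """Non-Cartesian coordinates u = x*y, v = x - y."""
    return (x * y, x - y)

def from_uv(u, v, branch=+1):
    """Recover Cartesian (x, y) from (u, v).

    y solves y^2 + v*y - u = 0; branch=+1 selects the root with x + y >= 0,
    branch=-1 selects the root with x + y <= 0.
    """
    disc = v * v + 4.0 * u          # equals (x + y)^2 for real x, y
    if disc < 0.0:
        raise ValueError("(u, v) is not the image of a real point (x, y)")
    y = (-v + branch * math.sqrt(disc)) / 2.0
    return (y + v, y)

# Report a measurement in (u, v) and transfer it back to Cartesian form.
u, v = to_uv(3.0, 1.0)
recovered = from_uv(u, v, branch=+1)
mirror = from_uv(u, v, branch=-1)
print((u, v), recovered, mirror)
```

With the branch fixed, the data expressed in (u, v) carry the same information as the Cartesian data, which is the point of the translation analogy in the text.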

According to the above the question whether a given observer-robot is inertial
makes sense only within a specified theory of Physics. For example in Newtonian
Physics it has been found that the coordinate system of the distant stars can 
be treated as being inertial. Similarly it has been verified that (within specified 
experimental limits) the coordinate system based on the solar system is a Newtonian 
inertial observer, for restricted motions and small speeds (compared to the speed of 
light) within the solar system. 

In order to prevent questions and misunderstandings which arise (as a rule) with
the concept of the inertial observer in Dynamics, we consider the Second Law of
Newton. This law does not define the Newtonian inertial observers but concerns the
study of motion (all motions) in space by Newtonian inertial observers. That is,
this law makes no sense for observers who are not Newtonian inertial observers, for
example for Relativistic Inertial Observers (to be considered below). The mathematical
expression of this law is the same for all Newtonian inertial observers and defines
the Newtonian physical quantity force. For all other observers the physical quantity
Newtonian force is not defined, as is the case, for example, in Special Relativity.


2.6.2 The Covariance Principle 


After establishing the relation between the inertial observers of a theory with the 
Cartesian coordinate systems in $\mathbb{R}^n$, it is possible to quantify geometrically the
Relativity Principle of that theory. This is achieved with a new principle, which 
we call the Covariance Principle and describe as follows. 

Let K be the set of all Cartesian coordinate systems in $\mathbb{R}^n$ (for proper n), which
correspond to the inertial observers of the theory. Let GK be the group of all (linear!)
transformations between the Cartesian systems in K. We demand the geometric
objects which describe the physical quantities of the theory to be covariant under
the action of GK. This means that if $\Sigma$, $\Sigma'$ are two elements of K with $\Sigma' = B\Sigma$,
where B is an element of GK ($n \times n$ matrix with coefficients parameterized only by the
parameters which relate $\Sigma$, $\Sigma'$), and T is the geometric object corresponding
to a physical quantity of the theory with components $T^{i'j'\ldots}_{r's'\ldots}$ in $\Sigma'$ and $T^{ij\ldots}_{rs\ldots}$ in $\Sigma$,
then the two sets of components are related by the following relation/transformation:

$$T^{i'j'\ldots}_{r's'\ldots} = B^{i'}_{i} B^{j'}_{j} \cdots B^{r}_{r'} B^{s}_{s'} \cdots T^{ij\ldots}_{rs\ldots}.$$


In Newtonian Physics the Cartesian coordinate systems (that is, the set K) are the 
Newtonian Cartesian systems and the transformations (that is, the group GK) form the 
group of Euclidian Orthogonal transformations. In the Theory of Special Relativity 
the Cartesian systems (set K) are the Lorentz Cartesian coordinate systems and 
the corresponding set of transformations (i.e. the GK) is the Lorentz group. Both 
these groups are subgroups of the general linear group GL(n, R) for n = 3, 4 
respectively. In Newtonian Physics the geometric objects corresponding to physical 
quantities are the Newtonian (or Euclidian) tensors and in the Theory of Special 
Relativity the Lorentz tensors. 
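
As an illustration of the covariance requirement in the Newtonian case, the following minimal sketch transforms the components of a vector and of a second order tensor under a Euclidian orthogonal transformation. The function names (rotation_z, transform_vector, transform_rank2) and the numerical values are our own assumptions chosen for the example; they are not part of the theory.

```python
import math

def rotation_z(theta):
    """A rotation about the z-axis: one element of the Euclidian orthogonal group."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transform_vector(B, v):
    """Components in the new frame: v'^i = B^i_j v^j."""
    return [sum(B[i][j] * v[j] for j in range(3)) for i in range(3)]

def transform_rank2(B, T):
    """Components in the new frame: T'^{ij} = B^i_k B^j_l T^{kl}."""
    return [[sum(B[i][k] * B[j][l] * T[k][l]
                 for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

def dot(u, v):
    """Euclidian inner product of two component triples."""
    return sum(a * b for a, b in zip(u, v))

B = rotation_z(0.3)                      # hypothetical relation between two frames
v = [1.0, 2.0, 3.0]
w = [-1.0, 0.5, 2.0]
v2, w2 = transform_vector(B, v), transform_vector(B, w)

# The tensor with components T^{ij} = v^i w^j, transformed with two factors of B
T = [[a * b for b in w] for a in v]
T2 = transform_rank2(B, T)
```

In this sketch the inner product dot(v2, w2) agrees with dot(v, w) up to rounding, and T2 agrees with the products of the transformed vector components, which is what covariance under the group demands.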

The Principle of Covariance, in contrast to the Principle of Inertia, does not 
involve experimental procedures and concerns only exchange of information (i.e. 
images) by means of transformations in the geometric space of a theory. We give the 
schematic diagram of Fig. 2.3 (compare with the corresponding diagram of Fig. 2.2). 
Note that the arrow which corresponds to the Principle of Covariance is bidirectional 
(refers to bidirectional communication) and it is different from the single arrow of 
observation (the observer observes the observed but not vice versa!). 




[Diagram: the characteristic quantity of a physical phenomenon is observed by observer 1 and by observer 2; the Covariance Principle of the theory relates the geometric picture of observer 1 and the geometric picture of observer 2 in both directions.] 

Fig. 2.3. Diagram of principles of covariance 


We also note that the physical quantities of a theory of Physics are those for 
which the diagram of Fig. 2.3 commutes. This means that the image of a physical 
quantity for observer 1 (direct image) must coincide with the image received from 
observer 2 (communicated image) after suitable transformation and vice versa. This 
“locking” of the diagram is the criterion that decides, in a unique way, the “reality” 
or “existence” of a physical quantity in a theory of Physics. 
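
The commutation of the diagram can be sketched numerically for two Newtonian observers related by a rotation. Everything named in the sketch (rotation_z, apply, displacement, length, and the chosen points) is a hypothetical illustration and assumes that the two frames share the same origin.

```python
import math

def rotation_z(theta):
    """Rotation about the z-axis relating the frames of observers 1 and 2."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply(B, v):
    """Apply the transformation B to a triple of components."""
    return [sum(B[i][j] * v[j] for j in range(3)) for i in range(3)]

def displacement(p, q):
    """Image of the displacement from point p to point q in one frame."""
    return [b - a for a, b in zip(p, q)]

def length(v):
    return math.sqrt(sum(x * x for x in v))

B = rotation_z(math.pi / 6)
p1, q1 = [1.0, 0.0, 0.0], [3.0, 1.0, 2.0]   # coordinates for observer 1
p2, q2 = apply(B, p1), apply(B, q1)          # coordinates for observer 2

# Vector quantity: direct image of observer 2 versus communicated image B * (image of 1)
direct_vec = displacement(p2, q2)
communicated_vec = apply(B, displacement(p1, q1))
vector_commutes = all(abs(a - b) < 1e-12
                      for a, b in zip(direct_vec, communicated_vec))

# Scalar candidates: the communicated image of a scalar is the same number
length_commutes = abs(length(direct_vec) - length(displacement(p1, q1))) < 1e-12
first_component_commutes = abs(direct_vec[0] - displacement(p1, q1)[0]) < 1e-12
```

Under these assumptions the displacement (as a vector) and its length (as a scalar) lock the diagram, whereas the first component taken alone does not, so it does not qualify as a Newtonian scalar quantity.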


2.7 Relativity and the Predictions of a Theory 


The geometrization of the Principle of Relativity of a theory of Physics allows the 
mathematical manipulation of the geometric objects, which describe the images of 
the physical quantities created by the theory. Therefore working in the geometric 
space of the images of the theory, it is possible to construct mathematically new 
images which satisfy the Principle of Covariance of the theory. The following 
question arises: 


Do the new mathematically constructed images correspond to images of existing physical 
quantities/phenomena or are they simply consistent mathematical constructions to which 
no physical quantities correspond? 


The assessment that they do exist is a prediction of the theory, which can be 
true or false. Therefore, the verification of a prediction of a theory consists in the 
verification of a proposed experimental/measuring procedure (creation of a direct 
image) of a physical quantity by an inertial observer of the theory. In conclusion, 
the verification of a prediction of a theory of physics is an iterative process which 
proceeds according to the following algorithm: 


a. We choose a physical phenomenon 

b. We propose an experimental procedure for its measurement by an inertial 
observer of the theory 

c. We compare the image obtained by an inertial observer with the one constructed 
mathematically by the theory 




d. If the two images coincide the prediction is verified. If not, then we may propose 
a different method of measurement or discard the prediction as false. 
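
The steps a to d can be sketched as a short loop. This is only an illustration under simplifying assumptions: the image is taken to be a single number, each proposed experimental procedure is modelled by a hypothetical function returning the measured value, and the name verify_prediction and the tolerance are ours.

```python
def verify_prediction(predicted_image, procedures, tolerance):
    """Try each proposed measuring procedure in turn (steps b to d).

    predicted_image: the number constructed mathematically by the theory.
    procedures: callables without arguments, each returning a measured number.
    Returns (True, index of the successful procedure), or (False, None) when
    no procedure reproduces the prediction and it is discarded as false.
    """
    for index, measure in enumerate(procedures):
        observed = measure()                      # direct image of the observer
        if abs(observed - predicted_image) <= tolerance:
            return True, index                    # the two images coincide
    return False, None

# Hypothetical example: a predicted value and two proposed procedures
procedures = [lambda: 9.2, lambda: 9.80]
result = verify_prediction(9.81, procedures, tolerance=0.05)
```

Here the first procedure fails and the second one succeeds, so the prediction counts as verified by the second method of measurement.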


The prediction and the verification of a prediction by experimental/measuring 
procedures is a powerful tool of Physics and gives it a unique position against all 
other sciences. Essentially the verification of a prediction increases our belief that a 
given theory of Physics is “correct”, in the sense that it describes well the “outside 
world”. At the same time predictions set the limits of the theories of Physics, 
because as we have already remarked, no theory of Physics appears to be⁵ THE 
ULTIMATE THEORY, which can explain and predict everything. However this is a 
difficult and open issue, which need not, and should not, concern us further in this 
book. 


⁵ The title “The Theory of Everything” is obviously erroneous. 


Chapter 3 
Newtonian Physics 


3.1 Introduction 


Newtonian Physics is the first theory of Physics which was formulated scientifically 
and became the milestone for the Theory of Special (and General) Relativity. All 
relativistic ideas are hidden in the Newtonian structure; therefore it is imperative 
that before we proceed with the development of the Theory of Special Relativity, 
we examine Newtonian Physics from the relativistic point of view. In the standard 
treatment, the concepts and the structure of Newtonian Physics are distinguished in 
a fundamental (and not formal as it is often stated) way, in two parts: Kinematics 
and Dynamics. Kinematics considers (a) the various structures (e.g. mass point, 
rigid bodies etc.) which experience motion and (b) the substratum (space and 
time) in which motion occurs. Dynamics involves the study of motion by means 
of equations of motion, which relate the causes (forces) with the development of 
motion (trajectory) in space and time. 

The same Kinematics supports many types of forces. For example Newtonian 
kinematics applies equally well to the motion of a mass point in the gravitational 
field and the motion of a charge in the electromagnetic field. In general, kinematics 
lies in the foundations of a theory of motion; it cannot be changed unless the theory 
itself is changed. On the other hand, dynamics can be changed within a given 
kinematic theory of physics, in the sense that one can consider different equations 
for the laws of motion in a given scenario. One such example is the force law 
on an accelerating charge moving in an electromagnetic field, where various force 
equations have been suggested. 

In what follows we present and discuss briefly the concepts which comprise 
the Kinematics and the Dynamics of Newtonian Physics. The presentation is 
neither detailed nor complete because its purpose is to prepare the ground for the 
introduction of the Theory of Special Relativity through the known “relativistic” 
environment of Newtonian Physics, and not to present Newtonian Physics per se. Of course 
the presentation which follows is based on the general discussion concerning the 
theories of Physics presented in Chap. 2. 

© Springer Nature Switzerland AG 2019 
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-27347-7_3 


3.2 Newtonian Kinematics 


The fundamental objects of Newtonian Kinematics are: 


— Mass point 
— Space 
— Time. 


In the following we deal with each of these concepts separately starting with the 
mass point, which is common in Newtonian and relativistic Kinematics. 


3.2.1 Mass Point 


The mass point in Newtonian Kinematics is identified with a geometric point in 
space and its motion is characterized completely by the position vector r at every 
time moment t in some coordinate system. The change of position of a mass point in 
space is described with a curve r(t), which we call the trajectory of the mass point. 
The purpose and the role of Kinematics is the geometric study of this curve. This 
study is achieved by the study of the first dr/dt and the second d²r/dt² derivatives¹ 
along the trajectory. At each point, P say, of the trajectory we call these quantities 
the velocity and the acceleration of the mass point at P. 
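
A minimal numerical sketch of these two derivatives follows. The trajectory (uniform circular motion of radius 2 and angular speed 3), the step h and the function names are assumptions made only for the example; the derivatives are approximated by central differences and are not computed exactly.

```python
import math

def trajectory(t):
    """Hypothetical trajectory r(t): uniform circular motion in the x-y plane."""
    return (2.0 * math.cos(3.0 * t), 2.0 * math.sin(3.0 * t), 0.0)

def velocity(r, t, h=1e-4):
    """Central difference approximation of dr/dt at the moment t."""
    before, after = r(t - h), r(t + h)
    return tuple((b - a) / (2.0 * h) for a, b in zip(before, after))

def acceleration(r, t, h=1e-4):
    """Central difference approximation of d²r/dt² at the moment t."""
    before, here, after = r(t - h), r(t), r(t + h)
    return tuple((a - 2.0 * m + b) / (h * h)
                 for a, m, b in zip(before, here, after))

def norm(v):
    return math.sqrt(sum(x * x for x in v))

v = velocity(trajectory, 0.5)
a = acceleration(trajectory, 0.5)
```

For this trajectory the speed should be close to 2 · 3 = 6 and the magnitude of the acceleration close to 2 · 3² = 18, which gives a quick check of the approximation.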

In Dynamics the mass point attains internal characteristics, which correspond to 
physical quantities e.g. mass, charge etc. These new physical quantities are used 
to define new mixed physical quantities (e.g. linear momentum), which are used 
in the statement of the laws of Dynamics. The laws of Dynamics are mathematical 
expressions which relate the generators of motion (referred to with the generic name 
forces or dynamical fields) with the form of the trajectory r(t) in space. Newton’s 
Second Law: 


$$\mathbf{F} = m \frac{d^2\mathbf{r}}{dt^2}$$


relates the quantities: 


F = Cause of motion 

d²r/dt² = Form (curvature) of the trajectory r(t) in space 

m = Coupling coefficient between F and d²r/dt² 


¹ It is a standard result that given the first and the second derivatives of a (smooth) curve in a 
Euclidian (finite dimensional) space at all the points of the curve, it is possible to construct the 
curve through any of its points. This is the reason we do not need to consider higher derivatives 
along the trajectory. 


In this law the force F is not specified and depends on the type of the dynamical 
field which causes the motion and the type of matter of the mass point. For example 
if the mass point has charge q, inertial mass m and moves in an electromagnetic 
field (E, B) then the force F has the form (Lorentz force): 


$$\mathbf{F} = kq(\mathbf{E} + \mathbf{v} \times \mathbf{B})$$


where k is a coefficient, which depends on the system of units, and v is the velocity 
of the mass point. 
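
The following sketch evaluates the Lorentz force and the resulting acceleration from Newton’s Second Law for one set of values. The numbers, the choice k = 1 and the function names are assumptions made for illustration only.

```python
def cross(u, v):
    """Vector product u × v of two component triples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def lorentz_force(q, E, v, B, k=1.0):
    """F = k q (E + v × B); k depends on the system of units (here assumed 1)."""
    v_cross_B = cross(v, B)
    return tuple(k * q * (e + c) for e, c in zip(E, v_cross_B))

def acceleration(F, m):
    """Newton's Second Law solved for the acceleration: d²r/dt² = F / m."""
    return tuple(f / m for f in F)

# Hypothetical values in arbitrary units
q, m = 2.0, 4.0
E = (1.0, 0.0, 0.0)
B = (0.0, 0.0, 1.0)
v = (0.0, 3.0, 0.0)

F = lorentz_force(q, E, v, B)   # v × B = (3, 0, 0), so F = 2 * (4, 0, 0)
a = acceleration(F, m)
```

The example shows the division of roles described in the text: the field and the charge fix F, while the mass m converts F into the form of the trajectory.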


3.2.2 Space 


In Newtonian Physics space is defined as follows²: 


Absolute space, in its own nature, without regard to anything external, remains always 
similar and immovable. 


This definition was satisfactory at the time it was given, because it had the necessary 
philosophical support allowing a theocratic perception of the world. However 
it is not satisfactory today, when science has made so many important steps into 
the knowledge of the structure of matter and the processing of information. This fact 
was recognized by Einstein (and others), who emphasized that the concept of space 
must be brought “down to earth” and become a fundamental tool in the theory and 
the laboratory.³ Therefore we have to give a “practical” answer to the question: 


What properties does the physical system space have? 


² This definition was given by Newton in his celebrated book “Principia”. For a more recent reference 
see Arnold Sommerfeld, “Lectures in Theoretical Physics, Volume I, Mechanics”, Academic Press 
(1964), page 9. 


³ In Einstein’s own words “... The only justification for our concepts and system of concepts is that 
they serve to represent the complex of our experiences; beyond this they have no legitimacy. I am 
convinced that the philosophers have had a harmful effect upon the progress of scientific thinking 
in removing certain fundamental concepts from the domain of empiricism, where they are under 
our control, to the intangible heights of a priori. For even if it should appear that the universe 
of ideas cannot be deduced from experience by logical means, but is, in a sense, a creation of the 
human mind, without which no science is possible, nevertheless the universe of ideas is just as little 
independent of the nature of our experiences as clothes are of the form of the human body. This 
is particularly true of our concepts of time and space, which physicists have been obliged by the 
facts to bring them down from the Olympus of the a priori in order to adjust them and put them in 




[Figure: two pendulum systems, panels (a) and (b)] 

Fig. 3.1 Fundamental parts and structure 


The answer we shall give will not conflict with the previous “classical” definition, 
but will make it clear and place it on the chessboard of today’s knowledge and 
experience. 

A physical system can have two types of properties: 


• Properties which refer to the character of the various fundamental units (i.e. parts) 
of the system 

• Properties which concern the interaction among the fundamental units of the 
system and comprise what we call the structure of the system. 


For example according to this point of view, the simple pendulum of Fig. 3.1a is 
the following system: 

Fundamental parts: (a) Mass point of mass m and (b) Supporting string of length 
l = l₁ + l₂. 

Structure: The mass m is attached at the one end of the supporting string, whose 
other end is fixed. 

The structure can be different although the fundamental parts remain the same. 
For example with the same parts we can construct the pendulum of Fig. 3.1b, whose 
physical properties (that is, motion — e.g. period of oscillation — in the gravitational 
field) are different from those of the simple pendulum. 

Let us consider now the system space, which is our concern. 

We define the fundamental units of the system space to be the points. The points 
are not defined in terms of simpler entities and in this respect are self-sufficient 
with respect to their existence. According to Newtonian Physics the points have no 
quantity and quality and they do not interact in any of the known ways with the rest 
of the physical systems. Due to this latter property, we say that the points of space 
are “absolute”. 

The structure of space is expressed with mathematical relations among its points. 
These relations constitute a complete and logically consistent structure, to which we 
give the generic name geometry. The set of points of a given space admits infinitely 
many geometries, in exactly the same way that the fundamental parts of the simple 
pendulum can create infinitely many systems by different combinations of the lengths 
l₁, l₂. 


a serviceable condition....” Extract from the book “The meaning of Relativity” by Albert Einstein, 
Methuen, Sixth Edition, London (1967). 




Newtonian Physics demands that whichever geometry is selected for the structure 
of the space, this is also “absolute”, in the sense that the relations which describe it 
do not change (are not affected) by the physical phenomena (not only the motion!) 
occurring in space. In conclusion: 


Space is comprised of absolute fundamental parts (the points) and has an absolute structure 
(geometry). 


This is what we understand by the phrase of the classical definition “space is 
absolute”. 

Newtonian Physics makes fundamental assumptions, which concern the general 
characteristics of Newtonian observation. According to our discussion in Chap. 2, 
these assumptions are necessary for observation to be a well defined procedure. In 
Newtonian Physics these assumptions are: 


1. The position of a mass point in space is described by the position vector, which 
is a vector in the linear space R³. Consequently the space of Newtonian Physics is 
a real three dimensional space and the coordinate systems of Newtonian Physics 
involve three (real) coordinate functions. 

2. The position of the origin of a Newtonian coordinate system in space is 
immaterial, or equivalently, all points of space can serve as origin of a Newtonian 
coordinate system. In practice this means that the point of space where an 
experiment or observation is made, does not affect the quality or the quantity 
of the physical phenomenon under consideration. This property of space we call 
homogeneity of space and it is fundamental in the development of Newtonian 
Physics. 

3. In space there exist motions whose trajectories are straight lines (in the sense of 
geometry) described by equations of the form⁴: 


$$\mathbf{r} = \mathbf{a}t + \mathbf{b}$$


where t ∈ R and a, b ∈ R³. Motions of this type we call inertial. 

4. In Newtonian Physics inertial motions can occur in all directions. In practice this 
implies that the basis vectors of a Newtonian frame can have any direction in 
space. This property of space we call isotropy. 

5. An inertial motion is preserved in time if no external causes (i.e. forces) are 
exerted on the moving mass point. Therefore the straight lines extend endlessly 
and continuously in space. This property means that the space is flat, or 
equivalently, its curvature is constant and equals zero. 


⁴ In geometry the linear spaces which admit straight lines are called affine spaces. 


The above assumptions specify the environment where the Newtonian measure- 
ments and observations will take place. They do not define the geometry of the 
space, which will be defined by the Principle of Relativity of the theory⁵. 

We continue with the other fundamental concept of Newtonian Physics, the time. 


3.2.3 Time 


In Newtonian Physics time is understood as follows⁶: 


Absolute time, and mathematical time, by itself, and from its own nature, flows equally 
without regard to anything external and by another name it is called duration. 


As was the case with the definition of space, the above definition of time is 
also not usable in practice. In order to arrive at a concrete and useful definition we 
follow the same path we took with the concept of space. That is, we consider 
time as a physical system which has fundamental parts and a structure defined 
by relations among these parts. Such elements must exist because according to 
Newtonian Physics, time exists. 

We consider the fundamental parts of time to be points, which we call “time 
moments” or simply “moments”. They have no quantity or quality (as the points 
of space). Concerning the structure (that is the geometry) of time, we demand that 
the set of moments has the structure of a one dimensional real Euclidian space. 
Mathematically this assumption means that: 


1. The time moments are described by one real number only 

2. All real numbers, which correspond to the time moments, cover completely the 
real line with its standard Euclidian structure. This line we call “straight line of 
time” or “cosmic straight line of time”. 


In practice the above imply that in order to define a system for the measurement 
of time (that is, a coordinate system in the “space” of time) it is enough to consider 
an arbitrary physical system, which has the ability to produce numbers. We shall 
call such physical systems by the generic name clock. There are numerous 
clocks in nature. Let us mention some of them: 


1. The sun, whose mass is continuously reduced. The value of the mass of the sun 
provides numbers in a natural manner and therefore can be considered as a clock. 

2. The position of the Earth on its orbit. Every point of the trajectory of the Earth 
along the ecliptic is specified by a single number (the arc length from a reference 
point). These numbers can be used to define the coordinate of time. This is 
done in practice and the resulting time is known as the calendar time. Similar 
measurements of time we define with other elements of the motion of the Earth. 


⁵ In fact the above “properties” define the geometry of a (Riemannian) space of constant vanishing 
curvature. However this geometry can be either Euclidean or Lorentzian therefore we need more 
assumptions in order to select a unique geometry. 


⁶ See Footnote 2. 


For example, the daily rotation of the Earth defines the astronomical time, the 
average value of the calendar time of two successive passages of the sun over the 
same point of the Earth during a calendar year gives the average solar time etc. 


3. The mass of a radioactive material e.g. the content in ¹⁴C is used in radiometry to 
measure the age of archaeological findings. Similarly, we have the atomic clocks 
which measure time with great accuracy. 

4. We can also define a time coordinate by mechanical systems. Such systems are 
all types of mechanical clocks used for many years in everyday life. 

5. Finally, we mention the quartz crystals which produce numbers by their oscillations 
under certain conditions. 


From the above examples we see that clocks exist in all areas of Physics, from the 
motion of the planets to the radioactivity of the nucleus or the Solid State Physics. 
This is a noticeable fact, which emphasizes the universality of time in the cosmos 
of Newtonian Physics. 


It is important that we differentiate between the concepts of clock and time. 
Indeed: 


• A clock is any physical (material) system, which produces single numbers. 
Clocks do not have an absolute character and they react with their environment. 
Time is absolute. 

• Clocks are considered as good, bad etc. whereas time has no quality. A “good” 
clock is one which produces numbers which flow equally and furthermore it 
is not affected substantially during its operation by the external environment. 
According to this view, an atomic clock is considered to be better than a 
mechanical clock. Obviously the better a clock is, the closer it represents the 
concept of absolute time. 


We return to the definition of time and specifically to the part “flows”, which we have 
not considered yet. The concept of flow is related with the concept of direction. This 
defines the so called arrow of time and has the following meaning: 


• Geometrically, it means that the cosmic line of time is oriented. Therefore the 
cosmic straight line of time is the real axis and not simply the real line. 

• Physically, it means that in Newtonian Physics the physical systems age with the 
same rate, irreversibly, irrevocably and independently of their choice. This is so, 
because time being absolute and universal affects everything and it is affected 
by nothing. Therefore at every moment there exists the past and the future, 
and they are independent of each other, however common to all physical systems 
in the cosmos. The demand of the direction in the “flow” of time restricts the 
physical systems, which can be used as clocks, because the numbers they produce 
must appear continuously and in ascending order. 


We examine now the concept “equally” in the definition of time. 

Geometrically it means that the direction of the arrow of time is constant, that is, 
independent of the point of time. With this new assumption the cosmic straight line 
of time is identified naturally⁷ with the line of real numbers as we use it, with origin 
at the number zero (which corresponds to the present). Then the positive numbers 
correspond to the “future” and the negative numbers to the “past”. We note that the 
concepts future and past are relative to the present, which is specified arbitrarily. 

Physically it means that the clocks we are using for the measurement of the 
Newtonian time must be unaffected by the environment and, furthermore, must work 
in such a way that they produce numbers continuously, in ascending order and with 
constant rate. The rate has to do with the structure of the clock and not with the 
actual value of the “time rate”, which is the result of the operation of the clock. For 
example, for a mass clock the rate dm must be constant between two numbers, but 
the value of this constant is not prefixed. 
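
The two requirements on a “good” clock (ascending order and constant rate) can be sketched as a simple check on a list of readings. The sketch rests on an idealisation that we state explicitly: the readings are assumed to be taken at equally spaced moments of the ideal clock, which no real clock can provide. The name is_good_clock and the tolerance are hypothetical.

```python
def is_good_clock(readings, tol=1e-9):
    """Check that successive readings ascend and advance at a constant rate.

    readings: numbers produced by the clock at equally spaced ideal moments
    (an assumption of this sketch). The value of the rate itself is not fixed.
    """
    steps = [later - earlier for earlier, later in zip(readings, readings[1:])]
    ascending = all(step > 0 for step in steps)
    constant_rate = all(abs(step - steps[0]) <= tol for step in steps)
    return ascending and constant_rate

steady = is_good_clock([0.0, 2.0, 4.0, 6.0])      # constant rate of 2 per step
uneven = is_good_clock([0.0, 1.0, 2.5, 3.0])      # ascending, but rate varies
backward = is_good_clock([5.0, 4.0, 3.0])         # numbers in descending order
```

Note that the check accepts any constant positive rate, in agreement with the remark that the value of the constant is not prefixed.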

Obviously such clocks do not exist in nature, because no physical system is 
energetically closed. Therefore the requirement for the measurement of Newtonian 
time takes us to the limit of the “ideal clock”, which is independent of everything, 
affected by nothing, absolutely isochronous and above all hypothetical. This ideal 
clock is the universal and absolute governor; in other words, it is the transfer 
of the divine concept in Newtonian Physics. It is important to note that the ideal 
clock does not have a universal beginning⁸ but only a universal rate (rhythm). 
Furthermore, it is the same for all physical phenomena and for every point of the 
space (“everywhere present”). This clock is absolute and it is the clock of Newtonian 
Physics. 

As we shall see presently, the Theory of Special Relativity abandoned the concept 
of Newtonian time — and accordingly the absolute universal clock — and replaced 
it with the concept of synchronization, which involves two clocks. In that theory, 
there is neither (universal) beginning nor (universal) rate (rhythm) common to 
all observers, but these concepts are absolute only for each and every relativistic 
observer. Naturally the Theory of Special Relativity introduces universal quantities 
with absolute character, but at another level. For the human intelligence, the absolute 
is always imperative for the definition of the relative, which is at the root of our 
perception and understanding of the world. 


⁷ That is without reference to coordinates, but directly point by point. 


⁸ The clock and the Newtonian concept of time had been used by ancient Greeks. However the clock 
of Aristotle had a fixed origin (beginning). Due to this, the Greeks were making Cosmogony and 
not Cosmology. Today we arrived again at the concept of the “beginning” of the Universe with the 
Big Bang theory. However we do Cosmology because the cosmos of Aristotle was constructed once 
and for all, whereas in our approach the universe is constantly changing as a dynamical system. The 
interested reader should look for these fascinating topics in special books on the subject. However 
he/she should be cautious to distinguish between the ‘myth’ and the ‘truth’, whatever the latter 
means. 


3.3 Newtonian Inertial Observers 


As we remarked in Chap. 2 every theory of Physics has its own “means” of 
observation and “directions” of use. It is due to these characteristics that the study of 
motion is differentiated from theory to theory. Furthermore, the “reality” of a theory 
of Physics is determined by two fundamental principles: 


1. The prescription of the procedure(s) for the observation of the fundamental 
physical quantities by the observer-robots of the theory and 

2. The “physical interpretation” of the results of the measurements, expressed by 
the Principle of Relativity of the theory. 


In Newtonian Physics both these principles are self evident, because Newtonian 
Physics has a direct relation with our sensory perception of the world (primary 
sensors), especially so for the space, the time and the motion. In the Theory 
of Special Relativity we do not have this direct sensory feeling of the “reality” 
(secondary sensors), therefore we have to work with strictly prescribed procedures 
and practices, which will define the relativistic physical quantities. Let us discuss 
these two principles in the case of Newtonian Physics. 

In Newtonian Physics it is assumed/postulated that the Newtonian observer is 
equipped with the following three measuring systems: 


1. An ideal unit rod, that is, a one dimensional rigid body whose Euclidian length 
is considered to equal unity, which is unaffected in all its aspects (i.e. rigidity and 
length) by its motion during the measurement procedure.⁹ 

2. A Newtonian gun for the determination of directions. This is a machine which 
fires rigid point mass bullets and it is fixed on a structure which returns the value 
of the direction (e.g. two angles on the unit sphere) of the gun at each instant. 

3. An ideal replica of the absolute clock, which has been set to zero at some 
moment during its existence. 


In Newtonian Physics it is assumed that there is no interaction between the 
measurement of the length and the measurement of time, because the ideal unit 
rod, the Newtonian gun and the ideal clock are considered to be absolute and closed 
to interactions with anything external. 


⁹ A prototype ideal unit rod was kept in Paris at the Institute of Standards. Today the standard units 
of length and time are defined with atomic rather than with mechanical physical systems. More 
specifically, the standard unit of 1 m was defined from 1960 to 1983 as the length of 1,650,763.73 
wavelengths of the orange-red line in the spectrum of ⁸⁶Kr (since 1983 it is defined through the speed 
of light), and the standard unit of time interval (1 s) is defined as the time required for 
9,192,631,770 periods of the microwave transition between the two hyperfine levels of the ground 
state of the isotope of Caesium ¹³³Cs. 




The Newtonian observer uses this equipment in order to perform two operations: 


• To verify that it is a Newtonian Inertial Observer 
• If it is a Newtonian Inertial Observer, to measure the position vector of a mass 
point in the physical space and associate with it a time moment. 


3.3.1 Determination of Newtonian Inertial Observers 


In the last section we have seen that Newtonian Physics introduces a special class of 
observers (the Newtonian Inertial Observers) and the reality it creates (that is the set 
of all Newtonian physical quantities) refers only to these observers. However, how 
does a Newtonian observer find out/determine that it is a Newtonian Inertial 
Observer? 

In order for a Newtonian observer to determine if it¹⁰ (not he/she!) is a Newtonian 
Inertial Observer, it applies the following procedure: 

At some nearby point in space it places, at a fixed position, a smooth rigid perfect 
plane (mirror) which can be rotated freely. Then it shoots a mass bullet towards the plane. 
Let us assume that the bullet hits the plane and it is reflected elastically. There are 
two cases: either there exists an angle of the mirror for which the bullet returns to the 
Newtonian gun, or there does not. In the first case, we say that the direction of firing is a 
temporary inertial direction for the observer. In the second case, we say that the Newtonian 
observer is not a Newtonian Inertial Observer. 

Assume that the Newtonian observer finds a temporary inertial direction. It then 
registers this direction by the direction pointer of its Newtonian gun and repeats 
the same procedure with the purpose to find two more independent instantaneous 
inertial directions. If this turns out to be impossible then the Newtonian observer is 
not an Inertial Newtonian Observer. If such directions are found and they last for a 
period of time then the Newtonian observer is an Inertial Newtonian Observer for 
that period of time. Motions which define Newtonian inertial directions are called 
inertial motions. 
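
The search for three independent inertial directions can be sketched as follows. This is a simplified model and not the experimental procedure itself: the firing experiment is replaced by a hypothetical function bullet_returns(direction), the directions are taken from a finite list of candidates, and a direction in which the bullet does not return is simply skipped. All names are our own.

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def independent(vectors, tol=1e-9):
    """Linear independence test for one, two or three vectors of R³."""
    if len(vectors) == 1:
        return dot(vectors[0], vectors[0]) > tol
    if len(vectors) == 2:
        c = cross(vectors[0], vectors[1])
        return dot(c, c) > tol
    a, b, c = vectors
    return abs(dot(a, cross(b, c))) > tol      # non-zero determinant

def find_inertial_directions(candidates, bullet_returns):
    """Collect up to three independent directions in which the bullet returns."""
    found = []
    for direction in candidates:
        if bullet_returns(direction) and independent(found + [direction]):
            found.append(direction)
        if len(found) == 3:
            break
    return found

def is_inertial_observer(candidates, bullet_returns):
    return len(find_inertial_directions(candidates, bullet_returns)) == 3

def always_returns(direction):
    """Toy model of an inertial observer: the bullet returns in every direction."""
    return True

def only_along_z(direction):
    """Toy model of a non-inertial observer: the bullet returns only along z."""
    return direction[0] == 0.0 and direction[1] == 0.0

candidates = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
inertial = is_inertial_observer(candidates, always_returns)
non_inertial = is_inertial_observer(candidates, only_along_z)
```

The second candidate is rejected in the first case because it is parallel to the first one, which reflects the demand that the three directions be independent.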

The following questions emerge: 


• Do Newtonian inertial directions exist and consequently Newtonian Inertial 
Observers? 

• If they do, why are there at most three independent inertial directions, which a 
Newtonian observer can find experimentally? 


¹⁰ We use ‘it’ and not ‘he/she’, because observers in Physics are machines (robots) not humans. 
The identification of the observers with humans is a remnant of Newtonian Physics and the early 
anthropomorphic approach to science, and is due mainly to the close relation of Newtonian Physics 
and early science with the sensory perception of physical phenomena. 




The answer to both questions is given by the following axioms. 


3.3.1.1 Newton’s First Law 


There do exist Newtonian Inertial Observers or, equivalently there exist Newtonian inertial 
directions in space. 


3.3.1.2 Axiom on the Dimension of Space 


Space has three dimensions. 


The above axioms assess the Inertial Newtonian Observers from the point of 
view of Physics, that is, in terms of physical measurements. However, in order for the 
Newtonian Inertial Observers to be used for the study of motion, it is necessary 
that they be identified in geometry. Obviously the geometric determination of an 
Inertial Newtonian Observer cannot be done by means of unit rods, clocks and guns 
but in terms of “geometric” objects. We define geometrically the Newtonian Inertial 
Observers by means of the following requirements/characteristics: 


1. The trajectory of a Newtonian Inertial Observer in the three dimensional Euclidian
space of Newtonian Physics is a straight line and its velocity is constant.

2. The coordinate systems of Newtonian Inertial Observers are the Euclidian 
Coordinate Frames (ECF). Therefore the numbers which are derived by the 
observation/measurement of a Newtonian physical quantity by a Newtonian 
Inertial Observer, are the components of the corresponding geometric object 
describing the quantity (i.e. the Newtonian tensor) in the ECF of the Newtonian 
Inertial Observer. 


The existence of Newtonian Inertial Observers is of rather theoretical value. 
Indeed in practice most observers move non-inertially or, equivalently, most motions 
in practice are accelerated. Then how do we make Physics for accelerated observers? 
To answer this question we generalize the concept of the Newtonian Inertial 
Observer as follows. 

At a point P (say) along the trajectory of an accelerated Newtonian observer (this
trajectory cannot be a straight line, except in the trivial case of one dimensional 
accelerated motion) we consider the tangent line, which we identify with the 
trajectory of an Inertial Newtonian Observer. This Newtonian Inertial Observer 
we call the Instantaneous Newtonian Inertial Observer at the point P. With 
this procedure a Newtonian Accelerated Observer is equivalent to (or defines) a 
continuous sequence of Inertial Newtonian Observers, each observer with a different 
velocity. Then the observations at each point along the trajectory of an accelerated 
observer are made by the corresponding Newtonian Inertial Observer at that point. 


3.3.2 Measurement of the Position Vector 


The primary element which is measured by all theories of Physics studying motion 
is the position of the moving mass point at every instant in physical space. 
This position is specified by the position vector in the geometric space where 
the theory studies motion. In Newtonian Physics motion is studied in the three 
dimensional Euclidian space; therefore the position vector is a vector r(t) in that 
space. The procedure of the observation/measurement of the instantaneous position
in Newtonian Physics is defined only for the Newtonian Inertial Observers, and it is
the following.

Consider a Newtonian Inertial Observer who wishes to determine the position 
vector of a point mass moving in space. The observer points his Newtonian gun 
at the point P and fires an elastic bullet. The bullet moves inertially — because the 
observer is a Newtonian inertial observer — and it is assumed that it has infinite speed 
(action at a distance). Assume that the bullet hits the point P where it is reflected 
by means of some mechanism and returns to the gun of the observer along the 
direction of firing. The observer marks this direction and identifies it as the direction 
of the position vector of the point P. In order to determine the length of the position 
vector the observer draws the straight line connecting the origin of his coordinate 
system, O say, with the point P. Then the observer translates the ideal unit rod 
along this straight line OP (this procedure is called superposition) and measures 
(as it is done in good old Euclidian Geometry) its length. The observer identifies the 
number resulting from this measurement with the length of the position vector of 
the point P. 

There remains still the measurement of the time coordinate. Newtonian Physics
asserts that the measurement of time is done simply by reading the ideal clock
indication at the moment of completing the measurement of the position vector.
Because in Newtonian theory time and space are assumed to be absolute, the 
independent measurement of the space coordinates and the time coordinate of 
the position vector is compatible and therefore acceptable. This completes the 
measurement of the position vector by the Newtonian Inertial Observer. In the 
following we shall write NIO for Newtonian Inertial Observer. 


3.4 Galileo Principle of Relativity 


With the measuring procedure discussed in the last section every NIO describes the 
position vector of any mass point with four coordinates in the coordinate frame it 
(not he/she) uses. However, as we have remarked in Chap. 2 the measurements of 
one NIO have no physical significance if they are not verified with the corresponding 
measurements of another NIO. This verification is necessary in order for the measured
physical quantity to be “objective”, that is, independent of the observer observing
it. According to what has been said in Chap. 2, the procedure of verification is
established by a Relativity Principle, which specifies: 


• An internal code of communication (= transformation of measurements/images)
between NIOs, which establishes the existence of Newtonian physical quantities.

• The type of geometric objects which will be used for the mathematical description
of the images of the Newtonian physical quantities.


The Principle of Relativity in Newtonian Physics is the Galileo Principle of 
Relativity and it is described by the diagram of Fig. 3.2, which is a special case of 
the general diagram of Fig. 2.2. 

The Galileo Principle of Relativity is of a different nature than the direct 
Newtonian observation, because it relates the images of one NIO with those of 
another NIO and not an observer with an observed physical quantity. For this reason 
the arrow which represents the Galileo Principle is bidirectional (mutual exchange 
of information). 

More specifically, let us consider the NIO O and let K be his Euclidian Cartesian 
coordinates in space. Consider a second NIO O’ with the Euclidian Cartesian 
coordinate system K’. Then the Galileo Relativity Principle says that there exists 
a transformation which relates the coordinates of all points in space in the systems 
K, K’. We demand that this coordinate transformation be linear and depend
only on the relative velocity of the coordinate frames K, K’.

We note that the triangle formed by the various arrows is closed, which means 
that if a Newtonian physical quantity is observed by one NIO, then the physical 
quantity can be observed by all NIOs. Finally, the triangle commutes, which means 
that in order to describe the motion of a mass particle, it is enough to use one NIO 
to perform the measurement of the position vector, and then communicate the result 
to any other NIO by means of the proper Galileo coordinate transformation. 


[Figure: a triangle of arrows. The physical quantity (motion of a mass point) is linked by a
“Newtonian observation/measurement” arrow to each of Newtonian Inertial Observer 1 and
Newtonian Inertial Observer 2; the two observers are linked by the bidirectional arrow
“Galileo Principle of Relativity”.]


Fig. 3.2. The Galileo relativity principle




3.5 Galileo Transformations for Space and Time: Newtonian 
Physical Quantities 


The transformation of coordinates specified by the Principle of Relativity of a
theory of Physics is a well defined mathematical procedure if it leads to a set of
transformations which:


• Form a group under the operation of composition of maps.
• Are each specified uniquely in terms of the relative velocity of the
observers they relate.


In Newtonian Physics the Galileo Principle of Relativity leads to the Galileo 
transformations. These transformations relate the coordinate systems of NIOs and 
they define the Galileo group¹¹ under the composition of transformations.

In order to compute the Galileo transformations, one needs a minimum number of 
fundamental Newtonian physical quantities, which will be the basis on which more 
Newtonian physical quantities will be defined. The space and the time must be the 
first fundamental Newtonian physical quantities, because they are the substratum on 
which motion is described and studied. Therefore the first demand/requirement of 
the Galileo Principle of Relativity is: 


3.5.1 Galileo Covariant Principle: Part I 


For NIO the position vector and the time coordinate are Newtonian physical quantities. 


In order to compute the analytic expression of the Galileo transformations, we
consider the ideal unit rod and let A, B be its end points, whose position vectors
in K, K’ are respectively $\mathbf{r}_A, \mathbf{r}_B$ and $\mathbf{r}'_A, \mathbf{r}'_B$. Then the Galileo Relativity Principle
specifies the exchange of information between the NIO K, K’ by the requirements:


3.5.2 Galileo Principle of Communication 


(a) The Euclidian distance of the points A and B is an invariant. 
(b) The time moment of the points A, B is the same and it is also an invariant. 


These two requirements when expressed mathematically lead to the following 
equations: 


$$(AB) = (AB)', \qquad t = t'$$


¹¹This is the isometry group of the three dimensional Euclidian metric. This metric has been
introduced silently in our assumption of the existence of the rigid rods. In Special Relativity there
do not exist (in general) rigid bodies.




or, 


$$(\mathbf{r}_A - \mathbf{r}_B)^2 = (\mathbf{r}'_A - \mathbf{r}'_B)^2 \tag{3.1}$$


$$t = t'. \tag{3.2}$$


Equation (3.1) is the equation of Euclidian isometry, which we studied in Sect. 1.5. 
Therefore Galileo transformations are the group SO(3) of Euclidian Orthogonal 
Transformations (EOT), which in two general, not necessarily orthogonal frames, 
K, K’ are given by the equation: 


$$A^t [g]_K A = [g]_{K'}. \tag{3.3}$$
Especially for a Euclidian Cartesian Coordinate system (and only there!) $[g]_K =
[g]_{K'} = I_3$, where $I_3$ is the unit 3 × 3 matrix $(\delta_{\mu\nu})$, and the transformation matrix A
satisfies the orthogonality relation:


$$A^t A = I_3. \tag{3.4}$$


Under the action of the Galileo transformations the position vector transforms
as follows:


$$\mathbf{r}' = A\mathbf{r} + \mathbf{O'O}. \tag{3.5}$$


In this relation r is the position vector as measured by the NIO O, r’ is the 
position vector (of the same mass point!) as measured by the NIO O’ and A is 
the Galileo transformation relating the observers O, O’. A vector in $E^3$ describes a
Newtonian physical quantity if and only if under the action of a Galileo coordinate
transformation it transforms as in equation (3.5).

The time transformation equation (3.2) simply says that time is an invariant 
of the group of Galileo transformations. In conclusion, the Galileo Principle of 
Relativity provides us with one Newtonian vector quantity (the position vector) and 
one Newtonian scalar quantity (the time). In the next section we show how these 
two fundamental Newtonian physical quantities are used to define new ones. 
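
The content of equations (3.4) and (3.5) can be checked numerically. The following is a minimal sketch in Python, assuming a rotation about the z-axis as the orthogonal matrix A and an arbitrary shift standing in for the vector O’O. The function names (`rotation_z`, `galileo`, etc.) and all numerical values are illustrative choices and do not come from the text.

```python
import math

def rotation_z(theta):
    """3x3 rotation about the z-axis: one example of an orthogonal matrix A."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def mat_vec(A, v):
    """Matrix-vector product A v."""
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]

def transpose_times(A):
    """Return the product A^t A, which equals I_3 for an orthogonal A (eq. 3.4)."""
    return [[sum(A[k][i] * A[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def galileo(A, shift, r):
    """r' = A r + O'O (eq. 3.5); `shift` plays the role of the vector O'O."""
    Ar = mat_vec(A, r)
    return [Ar[i] + shift[i] for i in range(3)]

def distance(p, q):
    """Euclidian distance between two points."""
    return math.sqrt(sum((p[i] - q[i]) ** 2 for i in range(3)))

A = rotation_z(0.7)                 # assumed rotation angle (radians)
shift = [1.0, -2.0, 0.5]            # assumed displacement of the origins
r_A, r_B = [0.0, 0.0, 0.0], [1.0, 2.0, 2.0]   # end points of a rod of length 3

d = distance(r_A, r_B)
d_prime = distance(galileo(A, shift, r_A), galileo(A, shift, r_B))
print(d, d_prime)   # the two lengths agree up to rounding error
```

The shift cancels in the difference of the two end points, so only the orthogonality of A matters for the invariance of the length, which is the statement of equation (3.1).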


3.6 Newtonian Physical Quantities: The Covariance 
Principle 


The Galileo Principle of Relativity introduced the Newtonian physical quantities 
position vector and time. However in order to develop Newtonian Physics one 
needs many more physical quantities. Therefore we have to have a procedure, which 
will allow us to define additional Newtonian physical quantities. This procedure is 
established by a new principle, called the Principle of Covariance, whose general 
form has been given in Sect. 2.6.2 of Chap. 2. 


86 3 Newtonian Physics 


3.6.1 Galileo Covariance Principle: Part II


The Newtonian physical quantities are described with Newtonian tensors. 


More specifically the Newtonian physical quantities have $3^n$, $n = 0, 1, 2, \ldots$
components, which under the Galileo transformations transform as follows:


$$T^{\mu'_1 \mu'_2 \mu'_3 \cdots} = A^{\mu'_1}_{\mu_1} A^{\mu'_2}_{\mu_2} A^{\mu'_3}_{\mu_3} \cdots T^{\mu_1 \mu_2 \mu_3 \cdots} \tag{3.6}$$


where $T^{\mu'_1 \mu'_2 \mu'_3 \cdots}$, $T^{\mu_1 \mu_2 \mu_3 \cdots}$ are the components of the Newtonian physical
quantity T as measured by the NIO O’ and O respectively and $A^{i'}_{i}$ $(i, i' = 1, 2, 3)$
is the Euclidian Orthogonal transformation relating K’, K and defined by the
transformation of the connecting vector(!).

Obviously for n = 0 one gets the Newtonian invariants, for n = 1 the Newtonian
vectors etc. The Galileo Covariance Principle does not say that all Newtonian 
tensors are Newtonian physical quantities. It says only that a Newtonian tensor is 
a potential Newtonian physical quantity and it is Physics which will decide if this 
quantity is indeed a physical quantity or not! More on that delicate subject we shall 
say when we discuss the Theory of Special Relativity. 

From given Newtonian tensors we define new ones by the general rules stated in 
Sect. 1.4.1: 


Rule I If we differentiate a Newtonian tensor of order (r,s) wrt a Newtonian
invariant then the new geometric object we find is a Newtonian tensor of order
(r,s).


Rule II If we multiply a Newtonian tensor of order (r,s) with a Newtonian
invariant then the new geometric object we find is a Newtonian tensor of order
(r,s).


3.7 Newtonian Composition Law of Vectors 


The composition of Newtonian vectors is vital in the study of Newtonian Physics, 
however many times it is approached as a case by case matter, thus losing its deeper
geometric significance. The commonest rule of composing vectors in Newtonian 
Physics is the composition of velocities. Indeed the composition of velocities is 
a simple yet important issue of Newtonian Kinematics and constitutes one of the 
reasons for the introduction of the Theory of Special Relativity. 

We consider a point P with position vector r and r’ wrt the Cartesian
coordinate systems K and K’ of the NIO O, O’ respectively. The linearity of the 
space implies the relation: 


$$\mathbf{r} - \mathbf{OO'} = \mathbf{r}'. \tag{3.7}$$




The left hand side of relation (3.7) contains the vectors r, OO’, which are 
measured by the observer O and in the right hand side the vector r’ which is 
measured by observer O’. The linearity of space implies that these two “different” 
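conceptions of the point P are the same. We differentiate (3.7) wrt the time t of
observer O and find:


$$\frac{d\mathbf{r}}{dt} - \frac{d\mathbf{OO'}}{dt} = \frac{d\mathbf{r}'}{dt} \tag{3.8}$$

or, in terms of the velocities:

$$\mathbf{V}_P - \mathbf{V}_{OO'} = \frac{d\mathbf{r}'}{dt}. \tag{3.9}$$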

In equation (3.9) $\mathbf{V}_P$, $\mathbf{V}_{OO'}$ are the velocities of P and O’ as measured by the
observer O. The quantity $\frac{d\mathbf{r}'}{dt}$ in the right hand side is not a velocity, because the
time t is not the time of the observer O’.


Let t’ be the time of observer O’. Then (3.9) is written as: 


$$\mathbf{V}_P - \mathbf{V}_{OO'} = \frac{dt'}{dt}\mathbf{V}'_P \tag{3.10}$$


where $\mathbf{V}'_P$ is the velocity of P as measured by observer O’. However, in Newtonian
Physics t’ = t (because time is an invariant!), therefore $\frac{dt'}{dt} = 1$, and relation (3.10)
gives the following law of composition of velocities in Newtonian Physics:

$$\mathbf{V}_P - \mathbf{V}_{OO'} = \mathbf{V}'_P. \tag{3.11}$$


Now relation (3.11) is the Galileo transformation of the vector Vp from the 
coordinate system K of O to the coordinate system K’ of O’, that is: 


$$\mathbf{V}_P \xrightarrow{K \to K'} \mathbf{V}'_P = \mathbf{V}_P - \mathbf{V}_{OO'} \tag{3.12}$$


where $\mathbf{V}_{OO'}$ is the velocity of O’ as measured by O.
Having the above as a guide we define the composition law of a Newtonian vector
quantity $\mathbf{A}_P$, say, observed by the NIO O and O’ with the relation:


$$\mathbf{A}_P \to \mathbf{A}'_P = \mathbf{A}_P - \mathbf{A}_{OO'} \tag{3.13}$$
where $\mathbf{A}_{OO'}$ is the corresponding vector A of O’ as it is measured by O.
Equivalently, if Γ(O, O’) is the Galileo transformation relating O, O’ we may
define the composition law of the Newtonian vector $\mathbf{A}_P$ with the relation:


$$\mathbf{A}'_P = \Gamma(O, O')\mathbf{A}_P. \tag{3.14}$$




For example if $\mathbf{a}_P$ (respectively $\mathbf{a}'_P$) is the acceleration of the point P as measured
by the observer O (respectively O’) and $\mathbf{a}_{OO'}$ is the acceleration of observer O’ as
measured by observer O then:


$$\mathbf{a}'_P = \Gamma(O, O')\mathbf{a}_P = \mathbf{a}_P - \mathbf{a}_{OO'}. \tag{3.15}$$


We conclude that the law of composition of the Newtonian vector quantities is 
equivalent to the Galileo transformation. This is the reason why the discovery that 
the velocity of light did not obey this composition law, resulted in the necessity 
of the introduction of a new theory of motion, which was the Theory of Special 
Relativity. 
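
The composition law (3.11) is simple enough to state as a one-line function. The sketch below, in Python, applies it to an ordinary projectile and then, purely as the Newtonian prediction, to a light signal. The function name `compose` and the numerical velocities are assumptions chosen for illustration; the only physical input is the law (3.11) itself.

```python
def compose(v_P, v_OO):
    """Newtonian composition law (3.11): V'_P = V_P - V_OO'."""
    return tuple(a - b for a, b in zip(v_P, v_OO))

C = 299_792_458.0            # speed of light in m/s

v_OO = (30.0, 0.0, 0.0)      # velocity of O' as measured by O (assumed value)
ball = (40.0, 5.0, 0.0)      # velocity of a ball as measured by O (assumed value)
light = (C, 0.0, 0.0)        # a light signal along the x-axis, as measured by O

ball_prime = compose(ball, v_OO)     # what O' would measure for the ball
light_prime = compose(light, v_OO)   # Newtonian *prediction* for the light signal

print(ball_prime)
print(C - light_prime[0])    # predicted change of the speed of light, in m/s
```

According to (3.11) the observer O’ should find the light signal slower by exactly the relative speed of the two observers. The experiments mentioned in the text did not find such a change, which is why light could not be treated as a Newtonian physical quantity.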


3.8 Newtonian Dynamics 


Newtonian kinematics is concerned with the geometric study of the trajectory of a 
mass point in space, without involving the mass point itself. For this reason the 
Newtonian physical quantities which characterize the trajectory of a mass point 
are limited to the position vector, the velocity and the acceleration. Newtonian 
Dynamics is the part of Newtonian Physics which studies motion including the mass 
point. In order to make this possible one has to introduce new physical quantities, 
which characterize the mass point itself. 

The simplest new Newtonian physical quantities to be considered are the 
invariants, the most basic being the inertial mass m of the mass point. The mass 
(inertial) m is the first invariant Newtonian physical quantity after the time and 
connects the Kinematics with the Dynamics of the theory. 

We recall that a Newtonian invariant is a potentially Newtonian physical quantity 
and becomes a physical quantity only after a definite measuring or observational 
procedure has been given. Therefore if we wish to have a mass m associated with 
each mass point, we must specify an experimental procedure for its measurement. 
This is achieved by the introduction of a new physical quantity, the linear momen- 
tum. 

For a mass point of mass m we define the vector p = mv, which we name 
the linear momentum of the mass point. Because v is a Euclidian vector and 
m a Euclidian invariant the quantity p is a Euclidian vector, therefore a potential 
Newtonian physical quantity. For this new vector we define a conservation law 
which distinguishes it from other Newtonian vectors (otherwise p would be an 
arbitrary vector without any physical significance). 




3.8.1 Law of Conservation of Linear Momentum 


A physical system consisting of Newtonian mass points (which are not assumed to interact
with gravitational forces) with linear momentum $\mathbf{p}_i$ will be called closed if the following
equation holds:


$$\sum_i \mathbf{p}_i = \text{constant}. \tag{3.16}$$


Relation (3.16) is the mathematical expression of the law of conservation of 
linear momentum. The word “law” means that equation (3.16) has been proved true 
in every case we have applied it so far, however we cannot prove it in general. We 
remark that this law holds for each and all NIO and furthermore that it is compatible 
with the Galileo Principle of Relativity. 

The deeper significance of the law of conservation of linear momentum is that
it turns the linear momentum from a potentially Newtonian physical quantity into a
Newtonian physical quantity. This is done as follows.

We consider two solid bodies (properly chosen, e.g. spheres) which rest on a smooth
frictionless surface, connected with a spring under compression and a string which
keeps them in place. At one moment we cut the string and the bodies move
apart in opposite directions (we assume one dimensional motion). We calculate the
velocities of the bodies just after the string is cut.

We number the bodies by 1 and 2 and consider two NIO O, O’ with relative
velocity $\mathbf{u}$. Let $m_1, m_2$ be the masses of the bodies wrt the observer O and $m'_1, m'_2$
wrt observer O’. We assume that the bodies 1, 2 are initially at rest wrt observer O and
that after the cutting of the string they have velocities $\mathbf{V}_1, \mathbf{V}_2$ wrt the observer O and
$\mathbf{V}'_1, \mathbf{V}'_2$ wrt the observer O’. Conservation of the linear momentum for observers O
and O’ gives:


$$m_1 \mathbf{V}_1 + m_2 \mathbf{V}_2 = 0 \tag{3.17}$$
$$-(m'_1 + m'_2)\mathbf{u} = m'_1 \mathbf{V}'_1 + m'_2 \mathbf{V}'_2. \tag{3.18}$$


From the Newtonian composition law for velocities we have:
$$\mathbf{V}'_1 = \mathbf{V}_1 - \mathbf{u}, \qquad \mathbf{V}'_2 = \mathbf{V}_2 - \mathbf{u}.$$
Replacing in (3.18) we find:
$$m'_1 \mathbf{V}_1 + m'_2 \mathbf{V}_2 = 0. \tag{3.19}$$
From (3.17) and (3.19) follows:


$$\frac{m'_1}{m'_2} = \frac{m_1}{m_2} \tag{3.20}$$


which implies that the quotient of the inertial masses of the bodies $\frac{m_1}{m_2}$ is a Euclidian
invariant. This invariant is a Euclidian physical quantity because it can be measured 
experimentally. Therefore if the mass of body 1 (say) is k times the mass of body 2 
for one NIO then this is true for all NIO. By choosing one body to have unit mass for 
one NIO (therefore for all NIO because we assume mass to be an invariant) we have 
an experimental measurement for mass. Therefore mass is a Newtonian physical 
quantity. 
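
The argument leading to (3.20) can be followed with numbers. The sketch below, in Python, lets each observer deduce the mass ratio from its own velocity measurements only, using momentum conservation in its own frame. The function name `mass_ratio` and the velocities are hypothetical choices made for this illustration; the motion is one dimensional, as in the text.

```python
def mass_ratio(v1, v2, u=0.0):
    """Ratio m1/m2 deduced from momentum conservation by one observer.

    Before the cut both bodies move with velocity -u in the observer's frame
    (u = 0 for observer O, in whose frame they start at rest); afterwards they
    move with v1 and v2.  Conservation reads
        -(m1 + m2) * u = m1 * v1 + m2 * v2,
    which gives m1 / m2 = -(v2 + u) / (v1 + u).
    """
    return -(v2 + u) / (v1 + u)

# Velocities after the cut as measured by O (illustrative numbers).
V1, V2 = -1.0, 2.0
u = 3.0                       # velocity of O' relative to O (illustrative)

# Velocities measured by O', from the Newtonian composition law.
V1p, V2p = V1 - u, V2 - u

ratio_O = mass_ratio(V1, V2)          # uses (3.17)
ratio_Op = mass_ratio(V1p, V2p, u)    # uses (3.18)
print(ratio_O, ratio_Op)
```

Both observers obtain the same quotient, in agreement with (3.20), although the velocities they measure are different.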

Using the mass of a mass point and the acceleration we define the new potentially 
Newtonian physical quantity: 


ma. 


The Second Law of Newton says that this new quantity is a Newtonian physical 
quantity, which we call force: 


$$\mathbf{F} = m\mathbf{a}. \tag{3.21}$$
A subtle point which was pointed out for the first time by Special Relativity is
the covariant character of equation (3.21). That is if O’ is another NIO then we
demand that:


$$\mathbf{F}' = \frac{d\mathbf{p}'}{dt'} \tag{3.22}$$
where F’, p’ (of course t’ = t) are the physical quantities F, p as measured by
O’. This requirement says that in addition to the demand that the Newtonian
physical quantities must be expressed in terms of Newtonian tensors, the dynamical 
equations must also be tensor equations, that is covariant under the action of the 
Galileo transformations. This requirement, which has (unfortunately) also been called
the Covariance Principle, has at times been considered trivial, since the right
hand side of equation (3.22) defines a Newtonian physical quantity. This
point of view is not correct, because at this elementary level everything is profound.
Moreover, in more general cases (field theory) the Principle of Covariance attains a
practical significance in the formulation of the dynamical equations. Closing, we
emphasize that what has been said for equation (3.22), applies to all dynamical 
equations of Newtonian Physics, whatever dynamical Newtonian physical quantities 
they involve. 
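
As a hedged illustration of the covariance requirement, the following short derivation checks equation (3.22) in the simplest case: two NIO whose axes are parallel and whose relative velocity $\mathbf{u}$ is constant. The restriction to parallel axes and the use of $\mathbf{p} = m\mathbf{v}$ with constant mass are simplifying assumptions made here, not part of the general statement.

```latex
\begin{align*}
\mathbf{v}' &= \mathbf{v} - \mathbf{u}, \qquad t' = t, \qquad m' = m, \\
\mathbf{p}' &= m\mathbf{v}' = \mathbf{p} - m\mathbf{u}, \\
\frac{d\mathbf{p}'}{dt'} &= \frac{d\mathbf{p}}{dt} - m\frac{d\mathbf{u}}{dt}
                          = \frac{d\mathbf{p}}{dt} = \mathbf{F}.
\end{align*}
```

For this pair of observers equation (3.22) therefore holds with $\mathbf{F}' = \mathbf{F}$. If the frames are in addition rotated with respect to each other by a constant orthogonal matrix A, both sides pick up the same factor A, so the form of the equation is again unchanged.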


Chapter 4
The Foundation of Special Relativity


4.1 Introduction 


In Chap.3 we developed Newtonian Physics from the relativistic point of view. 
We have discussed the deeper role of Galileo’s Principle of Relativity and
the concept of Newtonian physical quantity. In this chapter we shall use these 
Newtonian relativistic concepts to formulate the Theory of Special Relativity. We 
emphasize that concerning their structure both Newtonian Physics and Special 
Relativity are similar, the difference between the two theories being in the method 
of position measurement. 

Why did the Theory of Special Relativity need to be introduced? How long did it
take for the theory to be developed and when did this take place? For a detailed answer
to these and similar questions the interested reader should consult the relevant 
literature. In the following we shall refer briefly to some historic elements mainly for 
the conceptual understanding and the historic connection. Newtonian Physics was at 
its summit by the end of the nineteenth century when the mechanistic (deterministic) 
conception of the world was the prevailing trend in science. Indeed it was believed 
that one could determine the future and the past of a system (not necessarily a system
of Physics) if one was given the present state of the system and its “equation 
of evolution”. It was believed that everything was predetermined concerning the 
development in space and time, a point of view which was in agreement with 
the view of absolute time and absolute space. More specifically it was believed 
that all physical phenomena were described by Newtonian physical quantities and 
interactions among these quantities.¹

During the second part of the nineteenth century a number of experiments, 
carried out mainly by Michelson and Morley, indicated that there was


¹“The stone age did not finish because they ran out of stones, but because iron was found,
which was a better solution than stone.”


© Springer Nature Switzerland AG 2019
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-27347-7_4 




a fundamental physical system which could not be described by a Newtonian
physical quantity. This was light, whose kinematics appeared to be at odds
with Newtonian kinematics. Specifically the speed of light appeared to be
independent of the relative velocity of the emitter and the receiver.

As is to be expected, this radical result could not be accepted by the established
scientific community and physicists tried to manipulate the Newtonian theory with
a view to explaining the “disturbing” non-Newtonian behavior of the speed of
light. However the changes were not considered at the level of the foundations
of Newtonian Physics (i.e. Galileo Principle of Relativity, Newtonian physical
quantities etc.) but instead they focused on the modification of the properties of
physical space. A new cosmic fluid was introduced, which was named ether, to
which were attributed as many properties as were required in order to explain the newly
discovered non-Newtonian phenomena, with the constancy of the speed of light
preeminent. Soon the ether attained too many and sometimes conflicting properties
and it was becoming clear that the “new” phenomena were just not Newtonian.

Poincaré understood that, and began to consider the existence of non-Newtonian 
physical quantities, a revolutionary point of view at that time. He came very close to 
formulating the Theory of Special Relativity. However the clear formulation of the 
theory was made by A. Einstein with his seminal work² (‘On the electrodynamics
of moving bodies’, Annalen der Physik, 17 (1905)).

In the following we present the foundation of Special Relativity starting from the 
non-Newtonian character of the light and subsequently stating Einstein’s Principle 
of Relativity and the definition of the relativistic physical quantities. 


4.2 Light and the Galileo’s Principle of Relativity 


4.2.1 The Existence of Non-Newtonian Physical Quantities 


In Newtonian Physics the Galileo Principle of Relativity is a consequence of the 
direct sensory experience we have for motion. Due to this, the Principle is “obvious” 
and need not be considered at the foundations of Newtonian Theory but only after 
the theory has been developed and one wishes to differentiate it from another theory 
of Physics. 

A direct result of the assumptions of Newtonian Physics concerning the space 
and time is the Newtonian law of composition of velocities. The special attention 
which has been paid to that law is due to historical reasons. Because of this law 


²An English translation of this paper can be found in the book Principle of Relativity by H. Lorentz,
A. Einstein, H. Minkowski, H. Weyl, Dover (1952). In this volume one can find more papers which
led to the development of the Theory of Special Relativity including the work of Einstein where
the famous relation E = mc² appeared for the first time as well as the introduction of the term
spacetime by H. Minkowski.




it was shown for the first time that there are physical quantities which cannot 
be described with Newtonian physical quantities or, equivalently, which are not 
compatible with the Galileo Principle of Relativity. In practice this means that one 
cannot employ inertial Newtonian Observers to measure the kinematics of light 
phenomena with ideal rods and ideal clocks and then compare the measurements 
of different observers using the Galileo transformation. One could argue that we 
already have a sound theory of optics which is widely applied in Newtonian Physics. 
This is true. However one must not forget that the theory of Newtonian Optics
concerns observers and optical instruments which have very small relative velocities
compared to the speed of light, therefore the divergence of the measurements
is much smaller than the systematic errors of the instruments. For observers
with relative speed comparable to the speed of light (e.g. greater than 0.8c) the
experimental results differ significantly and are not compatible with the
Galileo Principle of Relativity. Due to the small speeds at which we move, these results
do not trigger the direct sensory experience and one needs special devices to do that,
which in Chap. 2 we called secondary sensors.

Finally, today we know that at high relative speeds it is not only the light
phenomena which diverge from the Newtonian reality but all physical phenomena
(e.g. the energy of elementary particles). It is this global change of Physics
at high speeds which is the great contribution of Einstein and not the Lorentz
transformation, which was known (in a different context) to Lorentz and Poincaré.
Normally only atomic and nuclear phenomena are studied at these speeds; therefore 
these are the phenomena which comprise the domain of Special Relativity. 


4.2.2 The Limit of Special Relativity to Newtonian Physics 


Our senses (primary sensors) work in the Newtonian world, therefore every theory 
of Physics eventually must give answers in that world, otherwise it is of no use. 
For this reason we demand that the new theory we shall develop below, in the limit 
of small relative velocities, will give numerical results which coincide (within the 
experimental accuracy) with the corresponding Newtonian ones, provided that they 
exist. Indeed it is possible that relativistic phenomena do not have a Newtonian limit, 
as for example the radioactivity in which the radioactive source can be at rest in the 
lab. 

The coincidence of the results of Special Relativity with the corresponding 
results of Newtonian Physics in the limit of small relative velocities, must not be 
understood as meaning that at that limit the two theories coincide. Each theory has 
its own distinct principles and assumptions which are different from those of the 
other theory. The limit concerns only the numerical values of the components of 
common physical quantities. 

The requirement of the limit justifies the point of view that there is a continuation 
in nature and the phenomena occurring in it and, in addition, enables one to identify 




the physical role (in our Newtonian environment) of the physical quantities of 
Special Relativity. Schematically these ideas are presented as follows: 


[Diagram relating “Physics of high speeds” to “Newtonian Physics”]


Exercise 4.2.1 Define the variable $l = ct$ and consider the D’Alembert operator:


$$\Box^2 = \nabla^2 - \frac{\partial^2}{\partial l^2}. \tag{4.1}$$


Let $V^4$ be a four-dimensional linear space and let $(l, x, y, z)$, $(l', x', y', z')$ be two
coordinate systems in $V^4$, which are related by the linear transformation:


$$x = ax' + bl', \quad y = y', \quad z = z', \quad l = bx' + al'. \tag{4.2}$$


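1. Show that under this coordinate transformation the D’Alembert operator
transforms as follows:

$$\Box'^2 = \Box^2 + (a^2 - b^2 - 1)\left(\frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial l^2}\right).$$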


2. Show that the D’Alembert operator transforms covariantly (that is $\Box'^2 = \Box^2$)
if the coefficients a, b satisfy the relation/constraint $a^2 - b^2 = 1$. One solution
of this equation is $a = \gamma$, $b = \beta\gamma$, $\gamma = (1 - \beta^2)^{-1/2}$, $\beta \in (0, 1)$. Write
the resulting coordinate transformation in the space $V^4$ using these values
of the coefficients and the general form (4.2). As will be shown below this
transformation is the boost along the x-axis and is a particular case of the
Lorentz transformation.

3. The wave equation for the electromagnetic field $\phi$ is $\Box^2 \phi = 0$. Show that the
wave equation is not covariant under a Galileo transformation and that it is 
covariant under a Lorentz transformation. This result shows that light waves are 
not Newtonian physical quantities. 


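[Hint: $\frac{\partial}{\partial x'} = \frac{\partial x}{\partial x'}\frac{\partial}{\partial x} + \frac{\partial l}{\partial x'}\frac{\partial}{\partial l}$, $\frac{\partial}{\partial l'} = \frac{\partial x}{\partial l'}\frac{\partial}{\partial x} + \frac{\partial l}{\partial l'}\frac{\partial}{\partial l}$ etc.]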
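
The chain-rule computation suggested by the hint can be organised as a small numerical check. The sketch below, in Python, treats a general linear map $x = ax' + bl'$, $l = cx' + dl'$ (the boost (4.2) is the case $c = b$, $d = a$) and returns the coefficients with which the second derivatives in x and l appear in the primed operator. The function names are hypothetical, and writing the Galileo transformation as $x = x' + \beta l'$, $l = l'$ with $\beta = v/c$ is an assumption about how it looks in the variables of this exercise.

```python
import math

def xl_coefficients(a, b, c, d):
    """Coefficients of d2/dx2, d2/dxdl, d2/dl2 in  d2/dx'2 - d2/dl'2
    for the linear map  x = a x' + b l',  l = c x' + d l'.

    Chain rule:  d/dx' = a d/dx + c d/dl,   d/dl' = b d/dx + d d/dl.
    """
    return (a * a - b * b, 2.0 * (a * c - b * d), c * c - d * d)

def is_covariant(coeffs, tol=1e-12):
    """The operator keeps its form exactly when the coefficients are (1, 0, -1)."""
    target = (1.0, 0.0, -1.0)
    return all(math.isclose(x, y, abs_tol=tol) for x, y in zip(coeffs, target))

beta = 0.6                                   # illustrative value in (0, 1)
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)

# Boost of the form (4.2) with a = gamma, b = beta * gamma.
boost = xl_coefficients(gamma, beta * gamma, beta * gamma, gamma)

# Galileo transformation in these variables (assumed form): x = x' + beta l', l = l'.
galileo = xl_coefficients(1.0, beta, 0.0, 1.0)

print(is_covariant(boost), is_covariant(galileo))
```

For the Galileo case the coefficients come out as $(1 - \beta^2, -2\beta, -1)$, so a mixed derivative appears and the form of the operator changes unless $\beta = 0$.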


When Einstein presented Special Relativity, people could not understand the new
relativistic physical quantities, because they appeared to behave in a “crazy” i.e.
non-Newtonian way. For that reason, Einstein devised simple but didactic arguments
whose main characteristic is the simple mathematics and the essential Physics.
These arguments have been called thought experiments (or Gedankenexperimente
in German) and have contributed essentially to the comprehension of the theory.
One such experiment we present in the following example.




Fig. 4.1 Spontaneous emission [diagram: the carriage M with the two masses emitted in opposite directions]


Example 4.2.1 Consider a train carriage of mass M which is resting on a smooth 
horizontal plane of a Newtonian Inertial observer. Suddenly and without any exter- 
nal cause two small equal masses m (< M/2) are emitted from the train carriage 
in opposite directions with equal speeds (see Fig. 4.1). Prove that this phenomenon 
cannot be explained if one assumes the conservation of mass, conservation of linear 
momentum and the conservation of energy. Furthermore show that the phenomenon 
can be explained provided one assumes: 


(a) The relation $E = \Delta m c^2$ where $\Delta m$ is the reduction in mass, $E$ is the kinetic
energy of the fragments and $c$ is a universal constant (the speed of light) and
(b) The mass is not preserved.


Solution 
Assuming conservation of linear momentum we infer that the remaining part of 
the train carriage will continue to be at rest after the emission of the masses m. 
Assuming conservation of mass we have that the mass of the train carriage after 
emission equals M — 2m. 
Concerning the conservation of energy we have that before the emission the
(kinetic) energy of the system equals zero whereas after the emission it equals
$\frac{2p^2}{2m} = \frac{p^2}{m}$, where $p$ is the measure of the linear momentum of each mass $m$. Therefore we
have a violation of the conservation law of energy. This violation would not bother
us if no such phenomena existed in nature. However observation has shown that this is
not so; one such example is the spontaneous disintegration of a radioactive nucleus.
Therefore we must be able to “explain” such phenomena with theory.

In order to give an “explanation” of the phenomenon we have to abandon
one of the three conservation laws and, at the same time, abandon Newtonian
Physics. Conservation of linear momentum is out of the question, because we know
that it works very well.³ Between the conservation of mass and the conservation of
energy we prefer to abandon the first. Therefore we assume that after the emission
the (relativistic not the Newtonian!) mass of the remaining train carriage equals
$M - 2m - \Delta m$ where $\Delta m$ is a correcting factor necessary to keep the energy balance.
Still we need a relation which will relate the (relativistic not Newtonian!) energy
with the (relativistic) mass. Of all possible relations we select the simplest, that is
$E = \lambda\,\Delta m$ where $\lambda$ is a universal constant. A dimensional analysis shows that $\lambda$
has dimensions $[L]^2[T]^{-2}$ that is, speed squared. But experiment has shown that a
universal speed does exist and it is the speed $c$ of light in vacuum. We identify $\lambda$
with $c^2$ and write:

$$E = \Delta m\, c^2.$$

³ Also it has to do with the linearity of the space.


Using this relation the conservation of energy gives:

$$\Delta m\, c^2 = 2\,\frac{p^2}{2m} = 2E_K$$

where $E_K$ is the (relativistic) kinetic energy of each fragment in the coordinate
system in which we are working. This equation shows that part of the mass of the
train carriage has been changed to kinetic energy of the masses $m$. This effect allows
us to say that matter in relativity exists in two equivalent forms: (relativistic) mass
and (relativistic) energy.


We emphasize that in the above example both mass and energy must be 
understood relativistically and not in the Newtonian context, otherwise one might 
be led to erroneous conclusions. 
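
To get a feeling for the size of the effect, the energy balance of the example, $\Delta m\, c^2 = 2E_K$, can be evaluated numerically. The sketch below is an illustration only: the SI value of $c$, the function name and the sample numbers are our assumptions, and the kinetic energy of each fragment is computed with the expression $p^2/2m$ used in the example, with $p = m \times$ speed.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def mass_defect(m, speed):
    """Mass reduction Delta m required by Delta m c^2 = 2 E_K.

    m     : mass of each emitted fragment (kg)
    speed : common speed of the two fragments (m/s)
    E_K is taken as p^2 / (2 m) with p = m * speed, as in the example.
    """
    p = m * speed
    kinetic_each = p * p / (2.0 * m)
    return 2.0 * kinetic_each / C**2

# Hypothetical numbers: two fragments of 1 kg emitted at 300 km/s.
dm = mass_defect(m=1.0, speed=3.0e5)
print(dm)   # of the order of 1e-6 kg
```

Even for such a violent emission the required reduction in mass is about a millionth of the mass of one fragment, which indicates why the effect is not noticed in everyday Newtonian phenomena.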


4.3 The Physical Role of the Speed of Light 


The speed of light has two characteristics which make it important in the study 
of physical phenomena. As we have already remarked the speed of light is not 
compatible with the velocity composition rule of Newtonian Physics, hence light is 
not a Newtonian physical quantity. This means that we must develop a new theory
of Physics in which the Newtonian means of measuring space and time intervals,
that is absolute rigid rods and absolute clocks, do not exist.

The second property is that the speed of light in vacuum is constant and 
independent of the relative velocity of the emitter and the receiver. This is not a truly 
established experimental fact because all experiments measure the speed of light in
two ways (“go” and “return”), therefore they measure the average round-trip speed.
However we accept this result as a Law of Physics and wait for the appropriate 
experiment which will validate this Law. 

The second property of light is equally important as the first because while the 
first prohibits the rigid rods and the absolute clocks, the second allows us to define 
new ways to measure space distances and consequently to propose procedures which 
will define relativistic inertial observers and relativistic physical quantities. These 
procedures are known under the collective name chronometry.

Chronometry defines a new arrow, in analogy with the Newtonian arrow of 
prerelativistic Physics: 


[Diagram: "Physical Quantity or phenomenon" and "Observer", connected by the arrow "Chronometry"]




Fig. 4.2 The Einstein principle of relativity (schematic: a Physical Quantity is related by Relativistic Observation and Measurement to Relativistic Observer 1 and to Relativistic Observer 2; the two observers are related by the Einstein Principle of Relativity)


The new type of observation defined by chronometry we call relativistic 
observation and the new class of observers relativistic observers. The theory 
of these observers is the Theory of Special Relativity. The new theory needs a 
Relativity Principle to define its objectivity. The new principle we call the Einstein 
Principle of Relativity and it is shown schematically in Fig. 4.2 with a diagram
similar to the one of the Galilean Principle of Relativity (see Fig. 3.2). 

Concerning this diagram we make the following observations: 


(a) The arrow between the two observers (although it has been drawn the same) is 
different from the corresponding arrow of Fig. 3.2. This is due to the fact that 
in Special Relativity there are events which are observable from the relativistic 
observer x but they are not observable from the relativistic observer y and vice 
versa, contrary to what is the case with the Newtonian Physics. 

(b) For the relativistic observation the sharp distinction between the observed and 
the observer still holds. This means that Special Relativity is a classical theory 
of Physics. In the non-classical theories of Physics this distinction does not hold
and one speaks of Quantum theories of Physics, relativistic or not.


Finally in Special Relativity the Einstein Principle of Relativity concerns the 
exchange of information between the relativistic observers and does not involve 
the relativistic observation itself. This Principle is quantified by a transformation 
group which defines the covariance group of the theory and subsequently (a) the 
mathematical nature of the relativistic physical quantities and (b) the mathematical 
form of the laws of Special Relativity. 


4.4 The Physical Definition of Spacetime 


Every theory of Physics creates via its observers “pictures” of the physical 
phenomena in a geometric space, the “space” of the theory. In Newtonian Physics 
this space is the Euclidian space $E^3$. In Special Relativity this space is called
spacetime and is fundamentally different from the space $E^3$ of Newtonian Physics.




The concept of spacetime was mentioned for the first time by H. Minkowski in 
his seminal talk at the 80th Congress of the German Scientists of Physical Sciences, 
which took place in Cologne on 21 September 1908, three years after the celebrated 
work of Einstein on the Theory of Special Relativity. The following words of H.
Minkowski in that congress are considered to be classical:


The views of space and time which I wish to lay before you have sprung from the soil of 
experimental Physics, and therein lies their strength. They are radical. Henceforth space by 
itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of 
union of the two will preserve an independent reality. 


In order to define the concept of spacetime (which apparently is a bad word 
because it has nothing to do either with the space or with the time!) we use the 
same methodology we followed with the concepts of space and time in Newtonian 
Physics, that is, we develop the concept of spacetime by defining its parts (points) 
and its structure (geometry). 


4.4.1 The Events 


The points of spacetime we call events. In contrast to the points of Newtonian space 
the events have identity. Indeed we consider that each event refers to something 
that happened to a physical system or systems (the event does not characterize the 
physical system per se). For example consider the disintegration of a nucleus. The
point of spacetime is the fact that the nucleus disintegrated, not where and when this
happened. The latter are the coordinates of the point of spacetime (the event) in some
coordinate system. The coordinates can change depending on the coordinate system 
used, however the event remains the same! In a sense spacetime is the aggregate of 
all facts which happened to all physical systems. 

We infer immediately that Newtonian space and Newtonian time are not relativistic
physical quantities, that is, they have no objectivity in Special Relativity as
individual entities. The reason is that no event is possible either for the (Newtonian) space or
for the (Newtonian) time: they are absolute, i.e. they interact with nothing,
therefore nothing happens to them!


4.4.2 The Geometry of Spacetime 


In addition to its points, spacetime is characterized by its Geometry. In Special 
Relativity it is assumed that the Geometry of spacetime is absolute, in the sense that 
the relations which describe it remain the same — are independent — of the various 
physical phenomena, which occur in physical systems. The Geometry of spacetime 
is determined in terms of a number of assumptions which are summarized below: 




1. Spacetime is a four dimensional real linear space. 

2. Spacetime is homogeneous. From the geometric point of view that means that 
every point in spacetime can be used equivalently as the origin of coordinates. 
Concerning the Physics, that means that where and when an experiment (i.e. 
event) takes place has no effect on the quality and the values of the dynamical 
variables describing the event. 

3. In spacetime there are straight lines, that is unbounded curves which are 
described geometrically with equations of the form 


r=as+b 


where $s \in \mathbb{R}$, $a, b \in \mathbb{R}^4$. From the Physics point of view these curves are the trajectories
of special motions of physical systems in $\mathbb{R}^4$ which we call relativistic
inertial motions. All motions which are not relativistic inertial motions we call
accelerated motions. With each accelerated motion we associate a four-force in 
a manner to be defined later on. 

4. Spacetime is isotropic, that is all directions at any point are equivalent. The 
assumptions of homogeneity and isotropy imply that the spacetime of Special 
Relativity is a flat space, or equivalently, has zero curvature. In practice this 
means that it is possible to define a coordinate system that covers all spacetime, 
or equivalently, spacetime is diffeomorphic to (i.e. looks like) the linear space
$\mathbb{R}^4$.

5. Spacetime is an affine space, that is, if one is given a straight line or a hyperplane 
(=three dimensional linear subspace with zero curvature) in spacetime then 
(axiom!) there exists at least one hyperplane which is parallel to it, in the sense
that it meets the given straight line or hyperplane at infinity.

6. If we consider a straight line in spacetime then there is a continuous sequence 
of parallel hyperplanes which cut the straight line once and fill up all spacetime. 
We say that these hyperplanes foliate spacetime. Due to the fact that there is 
no preferred (i.e. absolute) straight line in spacetime there are infinitely many 
foliations. In Newtonian spacetime there is the absolute straight line of time. 
Therefore there is a preferred foliation, that of cosmic time (see Fig. 4.3).

7. Spacetime is a metric vector space. The need for the introduction of the
metric is twofold. (a) It selects a special type of coordinate systems (the
Cartesian systems of the metric) which are defined by the requirement that in
these coordinate systems the metric has its canonical form (i.e. diagonal with
components $\pm 1$). These coordinate systems we associate with the Relativistic
Inertial Systems. (b) Each timelike straight line defines the foliation in which the
parallel planes are normal to that line.

Fig. 4.3 Newtonian and (Special) relativistic foliations of spacetime. (a) Newtonian foliation. (b) Relativistic foliation

8. The metric of spacetime is the Lorentz metric, that is the metric of the four 
dimensional real space whose canonical form is (—1, 1, 1, 1). The selection of 
the Lorentz metric is a consequence of the Einstein Relativity Principle as will 
be shown in Sect. 4.7. The spacetime endowed with the Lorentz metric we call 
Minkowski space (see also Sect. 1.4). In the following we prefer to refer to 
Minkowski space rather than to spacetime, because Special Relativity is not a 
theory of space and time but a theory of many more physical quantities, which 
are described with tensors defined over Minkowski space. In addition it is best to 
reserve the word spacetime for General Relativity. 


4.5 Structures in Minkowski Space 


The Lorentz metric is very different from the Euclidian metric. The most characteristic
difference is that the Lorentz distance of two different spacetime points can be
positive, negative or even zero whereas the Euclidian distance is always positive. Using this
property we divide spacetime at any point into three regions.


4.5.1 The Light Cone 


We consider all points Q of Minkowski space whose Lorentz distance from a fixed 
point P equals zero, that is: 


$$(PQ)^2 = 0.$$

If the coordinates of the points Q, P in some coordinate system are $(ct(Q), \mathbf{r}(Q))$
and $(ct(P), \mathbf{r}(P))$ respectively we have:

$$(\mathbf{r}(Q) - \mathbf{r}(P))^2 - c^2 (t(Q) - t(P))^2 = 0.$$

This equation defines a three dimensional conic surface in Minkowski space whose
apex is at the point P. This surface we call the light cone at P and the vectors $PQ^i$
null vectors. At each point in Minkowski space there exists only one null cone with
vertex at that point. Obviously different points have different light cones!

The light cone divides the Minkowski space into three parts (A), (B), (C) as shown
in Fig. 4.4. Region (A) contains all points Q whose Lorentz distance from P is
negative and for which the zeroth component of $PQ^i$ is positive. This region we call the future light
cone at P and the four-vectors $PQ^i$ future directed timelike four-vectors. Region
(B) contains all points Q whose Lorentz distance from the reference point P is
negative and for which the zeroth component of $PQ^i$ is negative. This region we call the past light
cone at P and the four-vectors $PQ^i$ past directed timelike four-vectors. Finally
region (C) consists of all points whose Lorentz distance from the reference point P
is positive.

Fig. 4.4 The light cone

In Special Relativity we consider that region (A) (respectively (B)) contains
all events which can receive information from (respectively emit information
to) the point P. We say that regions (A), (B) are in causal relation with the
reference point P.

The light cone at P concerns all the light signals which arrive at or are emitted from
P. These signals are used to transfer information from and to P.

The events in the region (C) are not connected causally with the reference point 
P, in the sense that no light signals can reach P from these points. The event 
horizon at the point P is the null cone at P. 

The fact that some events of Minkowski space “do not exist” for a given point
in Minkowski space (in the sense that it is not possible to get information from,
or send information to, these events with light signals) is difficult to understand,
due to the absolute and instantaneous possibility (signals with infinite speed!) of
global knowledge in Newtonian Physics. However because light transfers information
with a finite speed it is possible that no information or interaction can reach points
in Minkowski space beyond a Lorentz distance zero.
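
The division of Minkowski space into the regions (A), (B), (C) and the light cone can be sketched in a few lines of code. This is an illustration under stated assumptions and not part of the original text: events are given as tuples $(ct, x, y, z)$ in some LCF, the metric is diag(-1, 1, 1, 1), and the function names and the tolerance parameter are ours.

```python
def interval(p, q):
    """Lorentz distance (PQ)^2 for events given as (ct, x, y, z)."""
    d0, d1, d2, d3 = (qc - pc for pc, qc in zip(p, q))
    return -d0 * d0 + d1 * d1 + d2 * d2 + d3 * d3

def region(p, q, tol=0.0):
    """Region (relative to the reference event P) in which the event Q lies."""
    s2 = interval(p, q)
    if abs(s2) <= tol:
        return "light cone"
    if s2 > 0:
        return "C"                        # no causal relation with P
    return "A" if q[0] > p[0] else "B"    # future or past light cone

P = (0, 0, 0, 0)
print(region(P, (2, 1, 0, 0)))    # A
print(region(P, (-2, 1, 0, 0)))   # B
print(region(P, (1, 1, 0, 0)))    # light cone
print(region(P, (1, 3, 0, 0)))    # C
```

For coordinates which result from floating point computations a small positive `tol` should be used, because an exact zero of the Lorentz distance is then not to be expected.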


4.5.2 World Lines


In Newtonian Physics the motion of a Newtonian mass point is represented by its
trajectory in $E^3$. In Special Relativity the “motion” of a “relativistic mass point” is
represented by a curve in Minkowski space. This curve we call the world line of the
relativistic mass point.⁴ The world line represents the history of the mass point, in
the sense that each point of the curve contains information about the motion of the
mass point collected by the comoving or proper observer (the proper observer we
shall define later on) of the mass point. The photons are not considered to be mass
points in this approach. The events in spacetime are absolute, in the sense that the
information they contain is independent of who is observing the events.⁵ Because
the description of the information depends on the observer describing the event we
assume that the “absolute” (or reference) information is the one provided by the
proper observer of the physical system (photons excluded). Furthermore we assume
that the description of the proper observer is identical to the one of the Newtonian
observer.


4.5.3 Curves in Minkowski Space 


Let $x^i(\tau)$ be a curve in Minkowski space parameterized by the real parameter $\tau$. The
tangent vector to the curve at each point is the four-vector:


$$u^i = dx^i / d\tau.$$


From the totality of curves in spacetime there are three types of curves which are 
used in Special Relativity. 


a. Timelike curves 


These are the curves in Minkowski space whose tangent vector at all their points is 
timelike, that is: 


$$\eta_{ij} u^i u^j < 0.$$


The geometric characteristic of these curves is that their tangent four-vector at 
every point lies in the timelike region of the null cone at that point. Concerning 
their physical significance we assume that the timelike curves are world lines of 
relativistic mass points, that is, particles with non-zero mass. Because the speed 
of particles with non-zero mass is less than $c$ we call them bradyons (from the
Greek word βραδύς). All known elementary particles with non-zero rest mass are
bradyons e.g. electron, proton, pion, lepton etc.


⁴ A relativistic mass point should be understood as a particle with non-zero proper mass and speed
$< c$. In Sect. 6.2 we shall give a geometric and precise definition of the relativistic mass point.
⁵ This does not mean that different observers attribute the same coordinates to a given event. But
there is a coordinate transformation which brings the coordinates of one observer to the values of
the coordinates of the other observer and vice versa.




b. Null curves 


These are the curves in Minkowski space whose tangent vector $p^i$ at all points is
null:


$$\eta_{ij} p^i p^j = 0.$$


The null curves lie entirely on the light cone and we assume that they represent 
the world lines of relativistic particles of zero proper mass and speed c. One such 
particle is the photon (and perhaps the neutrino). These particles we call luxons. 


c. Spacelike curves 


These are the curves whose tangent vector $n^i$ at all their points is spacelike, that is:

$$\eta_{ij} n^i n^j > 0.$$


The tangent vector of these curves at all their points lies outside the light cone. 
We assume that these curves represent the field lines of various dynamical vector 
fields e.g. the magnetic field. The study of the geometry of these curves can be 
used to rewrite the dynamic equations of these fields in terms of Geometry and 
then use geometrical methods to deal with physical problems. This we shall not do 
in the present book but the interested reader can look up information on the web 
about the term spacelike congruences. At some stage people attempted to associate 
with these curves particles with imaginary mass and speed greater than c. These 
“particles” have been named tachyons (from the Greek word ταχύς which means
fast). Although the theory does not exclude the existence of such particles, it is safer 
that we restrict the role of the spacelike curves to field lines of dynamic fields. 

We emphasize that the three types of curves we considered do not exhaust the 
possible curves in Minkowski space. For example we do not consider curves which 
are in part spacelike and in part timelike. These curves do not interest us. Finally we 
note that each particular set of curves we considered is disjoint in the sense that a 
given curve cannot belong to two different sets. This corresponds to the fact that a 
bradyon can never be a luxon or a tachyon and vice versa. 
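
The classification of a curve by the sign of the length of its tangent vector can be illustrated numerically. The sketch below is only an illustration: it samples the tangent vector at finitely many points, so it can suggest but not prove the type of a curve. The example curve $x^i(t) = (ct, R\cos\omega t, R\sin\omega t, 0)$, the units with $c = 1$ and all function names are our assumptions.

```python
import math

def norm2(u):
    """eta_ij u^i u^j with eta = diag(-1, 1, 1, 1)."""
    return -u[0] ** 2 + u[1] ** 2 + u[2] ** 2 + u[3] ** 2

def curve_type(tangents, tol=1e-12):
    """Return 'timelike', 'null', 'spacelike' or 'mixed' for sampled tangents."""
    kinds = set()
    for u in tangents:
        n = norm2(u)
        if abs(n) <= tol:
            kinds.add("null")
        elif n < 0:
            kinds.add("timelike")
        else:
            kinds.add("spacelike")
    return kinds.pop() if len(kinds) == 1 else "mixed"

def circle_tangents(radius, omega, c=1.0, samples=50):
    """Tangents of x(t) = (c t, R cos(w t), R sin(w t), 0) at t = 0, 0.1, 0.2, ..."""
    return [(c,
             -radius * omega * math.sin(omega * 0.1 * k),
             radius * omega * math.cos(omega * 0.1 * k),
             0.0)
            for k in range(samples)]

print(curve_type(circle_tangents(1.0, 0.5)))   # timelike  (R w < c)
print(curve_type(circle_tangents(1.0, 2.0)))   # spacelike (R w > c)
print(curve_type([(1.0, 1.0, 0.0, 0.0)]))      # null
```

For the circular motion the length of the tangent vector is $-c^2 + R^2\omega^2$ at every point, so the curve is the world line of a bradyon only if $R\omega < c$.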


4.5.4 Geometric Definition of Relativistic Inertial Observers 
(RIO) 


In Newtonian Physics we divide the curves in $E^3$ in two classes: The straight lines
and the rest. The straight lines are identified with the trajectories of the Newtonian
Inertial Observers and the rest with the trajectories of the accelerated observers. Furthermore
all linear coordinate transformations in $E^3$ which preserve the Euclidian
distance define the Galileo transformation we considered in Sect. 3.5. The Galileo
transformation relates the measurements of two Newtonian Inertial Observers. 




Fig. 4.5 World line of an accelerated observer, approximated by straight line segments (RIO 1, RIO 2, RIO 3)


In analogy with the above we define in Special Relativity the Relativistic 
Inertial Observers (RIO) as the observers whose world lines are timelike straight 
lines in Minkowski space. The world lines which are not straight lines are assumed
to correspond to accelerated relativistic mass points. Furthermore the group of
linear transformations of Minkowski space which preserve the Lorentz metric we 
call the Poincaré group and a closed subgroup of it (to be defined properly 
later) is the Lorentz group consisting of the Lorentz transformations. The Lorentz 
transformation relates the measurements of two RIO. 

The identification of the world lines of RIOs with the timelike straight lines in 
Minkowski space is compatible with the concept of foliation of Minkowski space, 
which has been mentioned in Sect. 4.4. 

Special Relativity and Lorentz transformations involve straight lines only. How
can one study accelerated motions which are described by non-straight lines? This
is done as in Newtonian Physics (see Sect. 3.3.1) by means of the Local Relativistic 
Inertial Observers (LRIO) defined as follows. Each world line of an accelerated 
motion can be approximated by a great number of straight line segments as shown 
in Fig. 4.5. Each of these straight line segments defines the world line of an RIO, a 
different RIO at each point of the world line. The RIO at a point of the world line 
we call the LRIO at that point and assume that an accelerated relativistic motion is 
equivalent to a continuous sequence of inertial motions. This is a point to which we 
shall return when we study four-acceleration. 


4.5.5 Proper Time 


The world lines are parameterized by a real parameter. Out of all possible parameterizations
there is one class of parameters, called affine parameters, which are
defined by the requirement that the tangent vector $u^i = \frac{dx^i}{d\tau}$ has fixed Lorentz length
at all points of the world line. It is easy to show that if $\tau$ is an affine parameter, then
$\tau' = \alpha\tau + \beta$ where $\alpha, \beta \in \mathbb{R}$ is also an affine parameter. Out of all possible affine
parameters, we select the ones for which the constant length of the tangent vector is
$-c^2$, that is we demand:

$$\eta_{ij} \frac{dx^i}{d\tau} \frac{dx^j}{d\tau} = -c^2. \tag{4.3}$$

We identify this affine parameter with the time measured by the proper observer
of the world line and call it proper time. The demand that the length of the tangent
vector equals a universal constant makes all proper clocks have the same rate for all 
RIOs. Therefore, the only freedom left for the proper time is to set the “time zero” 
along the world line. 
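
The normalization of the tangent vector demanded above can be turned into a recipe for computing the proper time along a given world line. Writing the world line in the coordinates $(ct, \mathbf{r})$ of some RIO, the condition that the length of the tangent vector equals $-c^2$ gives $d\tau = \sqrt{1 - v^2/c^2}\, dt$, where $v$ is the speed of the mass point as measured by that RIO. The following minimal sketch integrates this relation numerically; the midpoint rule, the units with $c = 1$ and the function name are our choices and are not part of the original text.

```python
import math

def proper_time(speed, t0, t1, c=1.0, steps=10_000):
    """Integrate d(tau) = sqrt(1 - v(t)^2 / c^2) dt by the midpoint rule.

    speed  : function t -> v(t), the speed (smaller than c) measured by the RIO
    t0, t1 : initial and final coordinate time of the RIO
    """
    h = (t1 - t0) / steps
    total = 0.0
    for k in range(steps):
        v = speed(t0 + (k + 0.5) * h)
        total += math.sqrt(1.0 - (v / c) ** 2) * h
    return total

# Constant speed 0.6 c during 10 units of coordinate time.
print(proper_time(lambda t: 0.6, 0.0, 10.0))   # close to 8.0
```

For constant speed the result is simply the coordinate time divided by $\gamma$; the same function can be used with a speed which varies with $t$, that is, for an accelerated world line.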


4.5.6 The Proper Frame of a RIO 


Let $x^i(\tau)$ be the world line of a relativistic (not necessarily inertial) observer where $\tau$
is an affine parameter. At each point P along the world line there exists a comoving
observer, that is an observer with respect to whom the 3-velocity of the relativistic
observer equals zero. From all LCF⁶ systems in Minkowski space we select the one
which satisfies the following conditions:


(a) Its origin is at P, that is $\mathbf{r}_P = 0$.
(b) The tangent vector of the world line at P has components $u^i = (c, \mathbf{0})$.


This LCF we call the instantaneous proper frame, denoted by $\Sigma^+_P$. If the
proper observer is accelerating (equivalently, the world line is not a straight line
in Minkowski space) then the instantaneous proper frame is different from point
to point along the world line. Every instantaneous proper frame corresponds to an 
instantaneous proper observer as we described in Sect. 4.5.4. The aggregate of all 
these frames comprises the proper frame of the accelerated observer. 

If the relativistic observer $\Sigma$ is a RIO then the instantaneous proper frame is the
same all along the world line; we call it the proper frame of the RIO and denote it
by $\Sigma^+$.


4.5.7 Proper or Rest Space 


At every point P along the world line of a relativistic observer, affinely parameterized
with proper time $\tau$, we consider the (Lorentz) hyperplane normal to the world
line at the point P. This hyperplane we call the proper space of the observer at the
point P. For the case of a RIO the proper space at every point along its world line
is the space $E^3$. The proper spaces of a RIO are parallel and create a foliation of
Minkowski space as shown in Fig. 4.3.

⁶ Recall that a LCF (Lorentz Cartesian Frame) is a coordinate system in which the Lorentz
metric has its canonical form diag(-1, 1, 1, 1).

Fig. 4.6 Proper spaces of an accelerated observer (the figure shows the world line of an accelerated observer and two of its proper spaces)

The proper spaces of an accelerated observer are not parallel (see Fig. 4.6). 

If the proper spaces and the proper times of two RIO $\Sigma_1$, $\Sigma_2$ coincide then we
consider them to be the same observer. If two RIO $\Sigma_1$, $\Sigma_2$ move with constant
relative velocity then their world lines (which are straight lines) make an angle in
Minkowski space and their proper spaces intersect as shown in Fig. 4.3. The proper
spaces of each RIO give a different foliation of Minkowski space.⁷

The angle $\phi$ between the world lines we call rapidity and it is given by the
relation:

$$\tanh\phi = \beta, \quad \cosh\phi = \gamma, \quad \sinh\phi = \beta\gamma \tag{4.4}$$

where $\beta = \frac{u}{c}$ and $u$ is the relative speed of $\Sigma_1$, $\Sigma_2$. We note that $\tanh\phi$ takes its
upper limit when $\beta = 1$, i.e. $u = c$, that is when the relative speed of the two RIO
equals the speed of light in vacuum. This emphasizes the limiting character of $c$ in
Special Relativity.
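
The three relations (4.4) are mutually consistent, as a quick numerical check shows. The sketch below is an illustration only; the function names and the sample value $\beta = 0.6$ are our choices.

```python
import math

def rapidity(beta):
    """Rapidity phi defined by tanh(phi) = beta, for 0 <= beta < 1."""
    return math.atanh(beta)

def gamma(beta):
    """Factor gamma = (1 - beta^2)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)

beta = 0.6
phi = rapidity(beta)
print(math.cosh(phi), gamma(beta))          # both approximately 1.25
print(math.sinh(phi), beta * gamma(beta))   # both approximately 0.75
```

As $\beta$ approaches 1 the rapidity grows without bound, which is another way to express the limiting character of $c$.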


4.6 Spacetime Description of Motion 


The Theory of Special Relativity is a theory of motion, therefore the concept of
position vector is fundamental. That is, one has to specify the means and the
procedures (directions of use) which must be given to a RIO in order to determine
the components of the position vector of an event in spacetime. The use of rigid rods
is out of the question because they take one back to Newtonian Physics. Instead
the properties of light signals must be used because they are universal in Special
Relativity and, furthermore, light is the fundamental relativistic physical system of
that theory.

⁷ It is instructive to mention at this point that Newtonian Physics can be formulated in a four
dimensional Euclidian space, where one dimension is for the time and three dimensions are for
the Euclidian space $E^3$ in the same way it is done in Minkowski space. This four-dimensional
space is foliated by the hyperplanes $E^3$, however due to the absolute nature of time, this foliation
is the same for all Newtonian Inertial Observers. The contrast between the different foliations of
Special Relativity and the unique foliation of Newtonian Physics is the fundamental difference
(apart from the character of the metric) between the two theories.

Following the above, we equip the relativistic observers with two measuring 
devices: 


(a) A photongun 


This is a device which emits beams of photons and has a construction which 
makes possible the specification of the direction of the emitted beam. For example 
the photongun can be a monochromatic small laser emitter placed at the center of a 
transparent sphere on the surface of which there are marked equatorial coordinates 
which make possible the reading of the spherical coordinates of the emitted beam. 


(b) A personal clock (proper clock) 


This is every physical system which produces numbers in a specified way (see 
below) and it is used by the observer to associate a number with each distance 
measurement. This number is the time or zeroth component of the position four- 
vector. We demand that the proper clock: 


1. Will be the same for all RIOs.

2. Will produce numbers at a rate which is constant and
independent of the way a RIO moves.

3. Will work continuously.


We identify the numbers produced by the proper clock with the proper time of
the RIO. Each RIO has its own proper clock and the clocks of two RIO in relative
motion cannot coincide except at one point, which is the event for which both RIOs
set the indication of their clocks to zero. This activity is called synchronization of
the clocks. We see that in Special Relativity it is meaningless to speak about
“time” because there is no unique, universal clock which will measure it, nor the
corresponding Newtonian Inertial Observers who will use it as a common reference.

From the definition of the proper clock it is seen that the proper time of 
a relativistic observer increases continuously. This means that the world line is 
oriented, that is one can have a sense of direction along this line. We say that this 
direction defines the arrow of time for each observer. It is customary to say that
at each point of a worldline there exists past and future. This terminology is not 
quite successful and can cause confusion because these terms refer to the Newtonian 
conception of the world, where time is absolute and universal for all Newtonian 
observers. In Special Relativity each RIO has its own past and its own future. 

The photongun and the proper clock can be used by a relativistic observer in 
order to perform two fundamental operations: 




1. To determine if it (not he/she!) is a RIO 

2. In case it is a RIO, and only then, to determine the coordinates of the position 
four-vector of events in spacetime following a measuring procedure to be 
specified below. 


The two operations are different, independent and equally important. We discuss 
each of them below. 


4.6.1 The Physical Definition of a RIO 


In order for a relativistic observer to verify that it is a RIO, it must follow a procedure
which is identical with the corresponding procedure of a Newtonian observer with
the sole difference that the Newtonian gun is replaced with the photongun. There
is no point repeating this procedure here and we refer the reader to Sect. 3.3.1
where the procedure is described in detail and ask him/her to simply replace the
word gun with photongun. Any three relativistic inertial directions which specify a
RIO, $\Sigma$ say, are called relativistic inertial directions of $\Sigma$. If a relativistic observer
cannot determine three independent inertial directions the observer is an accelerated
relativistic observer.

The question which arises is: Do RIOs exist? The answer is given by the 
following Axiom. 


4.6.1.1 Axiom of Relativistic Inertia 


There exist relativistic observers, whose world lines in Minkowski space are timelike 
straight lines. These lines we call lines of time. For these observers there are at most three 
independent relativistic inertial directions in physical space, which is equivalent to the fact 
that physical space is three dimensional. 


This axiom has physical and geometric consequences. 

Concerning the physical consequences the axiom declares that a RIO perceives 
the physical space as continuous, isotropic, homogeneous and three dimensional 
that is, exactly as the Newtonian observers do. Furthermore their proper time is 
independent of their perception of space and it is described by a one dimensional 
Euclidian space. In conclusion a RIO is identical with the typical Newtonian Inertial 
Observer, as far as the perception of space and time is concerned. However different 
RIOs have different perceptions of space and time. 

Any three independent inertial directions determined by a RIO define in physical 
space a frame which we call a Relativistic Light Frame. A RIO can always find 
inertial directions which are mutually perpendicular (in the Euclidian sense). The 
frame defined by such directions we call a Lorentz Cartesian Frame (LCF). 

The geometric implications of the Axiom are the following:




a. The Axiom associates each RIO with one timelike straight line in Minkowski 
space, therefore with a definite foliation of Minkowski space. The three dimen- 
sional hyperplanes of this foliation (i.e. the proper spaces of the RIO) correspond 
to the perception of the physical space by that RIO. 

b. The Lorentz Cartesian frames correspond with the LCFs of Minkowski
space, that is, in these frames the Lorentz metric attains its canonical form
diag(-1, 1, 1, 1).


We note that the effect of the Axiom of relativistic inertia is similar to that of
Newton's First Law, that is, it geometrizes the concept of RIO as well as the concept
of LCF.


4.6.2 Relativistic Measurement of the Position Vector 


Having established the concept of a RIO and that of a LCF we continue with the 
procedure of measurement of the coordinates of the position vector in spacetime by 
a RIO. This procedure we call chronometry. 

Consider the RIO Σ and a point P in spacetime whose position vector is to be
determined. We postulate the following operational procedure.

The RIO Σ places at the point P a small plane mirror and sends to P a light
beam at the indication τ of the proper clock. There are two possibilities. Either
the light beam is reflected on the mirror and returns to the RIO along the same
direction of emission or not. If the second case occurs Σ changes the direction of
the photongun until the first case results. This is bound to happen because there are
inertial directions for a RIO. Then the RIO fixes that direction of the photongun and
reads:


1. The time interval 2Δt(P) between the emission and the reception of the light
beam

2. The direction $\mathbf{e}_r$ of emission of the light beam using the scale of
measurement of directions of the photongun.


Subsequently Σ defines the position vector r(P) of P in 3-space as follows:

$$\mathbf{r}(P) = c\,\Delta t(P)\,\mathbf{e}_r$$

where c is the speed of light in vacuum. Concerning the time coordinate of the point
P, Σ sets the number cτ + cΔt(P) where τ is the indication (proper time) of the
proper clock at the event of emission of the beam. The coordinates of the position
four-vector of the event P become⁸ $(c\tau + c\Delta t(P),\; c\Delta t(P)\,\mathbf{e}_r)$.
We see that the measurement of the components of the position vector requires the
measurement of two readings of the proper clock and one reading of the scale of
directions of the photongun.


⁸ We write ct instead of t for the time component, because the components of a vector
must have the same dimensions, that is space length $[L]^1[T]^0[M]^0$.
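The chronometry procedure described above can be sketched numerically. The following is only an illustrative sketch: the function name `chronometry`, its argument names and the choice of SI units are assumptions made for this example and are not part of the text. It takes the two readings of the proper clock and the direction read from the photongun and returns the components of the position four-vector.

```python
import math

C = 299_792_458.0  # speed of light in vacuum in m/s (SI units assumed here)

def chronometry(tau_emit, tau_receive, direction):
    """Return (ct, x, y, z) of the event P at which the beam is reflected.

    tau_emit, tau_receive: the two readings of the proper clock, in seconds.
    direction: emission direction read off the photongun (any nonzero 3-tuple).
    """
    norm = math.sqrt(sum(d * d for d in direction))
    e_r = tuple(d / norm for d in direction)      # unit vector e_r
    delta_t = 0.5 * (tau_receive - tau_emit)      # the round trip lasts 2*Delta t(P)
    ct = C * tau_emit + C * delta_t               # time coordinate c*tau + c*Delta t(P)
    r = tuple(C * delta_t * e for e in e_r)       # r(P) = c*Delta t(P) e_r
    return (ct, *r)
```

For instance, `chronometry(0.0, 2.0, (0.0, 0.0, 3.0))` describes a beam emitted at proper time 0 s that returns after 2 s along the third axis, so the mirror sits one light-second away.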


4.6.3 The Physical Definition of a LRIO 


Since chronometry has been defined only for RIOs, the erroneous view often arises
that Special Relativity cannot study accelerated motions. If that were true, the
theory would be of purely theoretical interest, because inertial motions are the
exception rather than the rule.

The extension of chronometry to non-inertial relativistic observers is done via the
concept of the Locally Relativistic Inertial Observer (LRIO), which we defined
geometrically in Sect. 4.5.4. In the present section we define the LRIO from the
Physics point of view.

Consider a relativistic mass which is accelerating wrt the RIO Σ. Let P be a
point along the trajectory of the mass point where r(t), v(t), a(t) are the position,
the velocity and the acceleration vector at time t in Σ respectively.

We consider another RIO Σ′ which wrt Σ has velocity u = v(t). This observer
is the LRIO of the mass point at the event P. Due to the acceleration, at time t + dt
the position of the moving mass will be r(t + dt) and its velocity v(t + dt); therefore
at the point r(t + dt) there is a different LRIO. Because the path of the moving
mass is independent of observation, the continuous sequence of LRIOs is inherent
to the motion of the mass. This sequence of LRIOs we call the proper observer
of the accelerating mass point. With this mechanism we extend chronometry to the
accelerated observers.

There is another way to understand accelerated relativistic observers by relaxing 
the condition that the Lorentz transformation is linear and preserves the canonical 
form of the Lorentz metric. This approach takes us very close to the theory of 
General Relativity. More on this topic we shall say in Chap. 7. 


4.7 The Einstein Principle of Relativity 


The “world” of a theory of Physics consists of all physical quantities of the theory 
which are determined by the Principle of Relativity of the theory as explained in 
Chap. 2. For example the Galileo Principle of Relativity determines the physical 
quantities of Newtonian Physics. In the same spirit the Einstein Principle of 
Relativity determines the physical quantities of Special Relativity. More specifically 
the Einstein Principle of Relativity acts at three levels: 


(a) Defines a code of exchange of information (transformation of “pictures”)
between RIOs and determines the group of transformations of the theory
(Poincaré group).




(b) Defines a Covariance Principle which determines the kind of geometric objects 
which will be used by the RIOs for the mathematical description of the 
relativistic physical quantities (Lorentz tensors). 

(c) Defines the mathematical nature of the equations of the theory (Covariance 
Principle). 


4.7.1 The Equation of Lorentz Isometry 


Consider a RIO Σ who observes a light beam passing through the points A, B of
real space. Let Δr be the relative position vector of the points as measured by Σ and
Δt the time required, according to the proper clock of Σ, for the light to cover
the distance AB. From the Principle of the constancy of the speed of light we have:

$$\Delta\mathbf{r}^2 - c^2\Delta t^2 = 0$$

where c is the speed of light in vacuum.⁹ For another RIO Σ′ who is observing the
same light beam (events) let the corresponding quantities be Δr′ and Δt′. For Σ′
we have again the relation:

$$\Delta\mathbf{r}'^2 - c^2\Delta t'^2 = 0.$$

The last two equations imply:

$$\Delta\mathbf{r}^2 - c^2\Delta t^2 = \Delta\mathbf{r}'^2 - c^2\Delta t'^2.$$


This equation relates the measurements of the components of events referring 
to the light beam and it is a direct consequence of the Principle of the constancy 
of c. The question which arises is: What will be the relation for events concerning 
mass points e.g. an electron? The answer has been given — and perhaps this is his 
greatest contribution — by Einstein who stated the following Principle and founded 
the Theory of Special Relativity: 


Definition 4.7.1 (Principle of Communication of Einstein) Let (cΔt, Δr),
(cΔt′, Δr′) be the components of the position four-vector $AB^i$ (not necessarily
null!) as measured by the RIOs Σ, Σ′ respectively. Then the following equation
holds:

$$\Delta\mathbf{r}'^2 - c^2\Delta t'^2 = \Delta\mathbf{r}^2 - c^2\Delta t^2 \tag{4.5}$$

which determines the communication and transformation of measurements from one
RIO to the other.


⁹ Note that the four-vector $AB^i$ is defined by the events: A: The light beam passes the
point A and B: The light beam passes the point B. The four-vector $AB^i$ is a null vector.


The physical significance of (4.5) is that it determines the code of communication
(i.e. the code for transferring information) between RIOs. We find the geometric
significance of (4.5) if we write it in the LCFs of Σ, Σ′ respectively. In these
frames (and only there!) (4.5) takes the form:

$$(c\Delta t, \Delta x, \Delta y, \Delta z)\,\mathrm{diag}(-1,1,1,1)\,(c\Delta t, \Delta x, \Delta y, \Delta z)^t
= (c\Delta t', \Delta x', \Delta y', \Delta z')\,\mathrm{diag}(-1,1,1,1)\,(c\Delta t', \Delta x', \Delta y', \Delta z')^t. \tag{4.6}$$

The lhs of equation (4.6) (see also Sects. 1.3 and 3.5) consists of two parts:


• A. The 1 × 4 matrix (cΔt, Δx, Δy, Δz) and
• B. The 4 × 4 matrix diag(-1, 1, 1, 1).


The first matrix concerns the chronometric measurements of the RIO and
quantifies the first relativistic physical quantity, the position four-vector
$x^i = (ct, \mathbf{r})$, whose components depend on the RIO. The second matrix is
independent of the RIO.¹⁰ The matrix diag(-1, 1, 1, 1) can be considered as the
canonical form of a metric in a four-dimensional real vector space. If this is done,
then this metric is the Lorentz metric. If we set η = diag(-1, 1, 1, 1) relation (4.6)
is written:

$$(ct, \mathbf{r})^t\,(\eta - L^t\eta L)\,(ct, \mathbf{r}) = 0$$

where L is the transformation matrix relating Σ, Σ′ and defined by the requirement
$(ct', \mathbf{r}')^t = (ct, \mathbf{r})^t L^t$ (t = transpose matrix). This relation must
be satisfied for all position four-vectors, therefore we infer that the matrix L is
defined by the relation:

$$\eta = L^t\eta L. \tag{4.7}$$


Relation (4.7) is the isometry equation for the Lorentz metric between two LCFs. We
infer that the coordinate transformations between RIOs using LCFs are the Lorentz
transformations we introduced in Sect. 1.3 equation (1.31). Therefore, as is the case
in Newtonian Physics where the Galileo Relativity Principle introduces the Galileo
transformations, in Special Relativity the Einstein Relativity Principle introduces
the Lorentz transformations as the transformations relating the components of
four-vectors (and tensors in general) of two RIOs. These ideas are described pictorially
in Fig. 4.7.
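The isometry equation (4.7) can be checked numerically for a concrete matrix. The sketch below is an illustration under stated assumptions: it uses plain Python lists, the helper names (`boost_x`, `matmul`, `transpose`, `is_lorentz`) are invented for this example, and the boost is taken along the x axis with β = V/c and the sign convention x′ = γ(x - βct).

```python
import math

# Canonical form of the Lorentz metric in an LCF
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def boost_x(beta):
    """4x4 boost along x acting on (ct, x, y, z); requires |beta| < 1."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0, 0],
            [-g * beta, g, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def is_lorentz(L, tol=1e-12):
    """True if L^t eta L equals eta within the tolerance tol."""
    m = matmul(matmul(transpose(L), ETA), L)
    return all(abs(m[i][j] - ETA[i][j]) <= tol
               for i in range(4) for j in range(4))
```

A boost such as `boost_x(0.6)` passes the check, whereas a matrix that rescales the time coordinate alone does not, because it changes the Lorentz length.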

The general solution of equation (4.7) has been given in Sect. 1.6 and it is a matrix
of the form L(β, E) = L(β)R(E). In this solution the matrix L(β) is the pure
relativistic part of the Lorentz transformation and it is given by equations (1.44) and
(1.45). It is parameterized by the relative velocity β of the LCFs whose space axes
are parallel. The matrix R(E) is a generalized Euclidian Orthogonal transformation
which depends on three parameters (e.g. Euler angles) and makes the axes of the
LCFs parallel.


¹⁰ However the form diag(-1, 1, 1, 1) changes if frames different than LCF are used.


[Fig. 4.7 Principle of communication of Einstein. The position four-vector is measured
through Chronometry 1 as (ct, r) and through Chronometry 2 as (ct′, r′); arrows between
the two sets of components indicate the transformation and its inverse.]


4.8 The Lorentz Covariance Principle 


In order to define the mathematical description of the relativistic physical quantities
we must define the mathematical nature of the fundamental physical quantities of
the theory. These quantities in Newtonian theory are two: One vector (the position
vector) and one invariant (the time). In Special Relativity we also have two such
quantities. They are the position four-vector and the invariant speed of light in vacuum.
Based on this observation we define the first part of the Lorentz Covariance Principle
as follows:


4.8.1 Lorentz Covariance Principle: Part I 


The position four-vector and the Lorentz distance (equivalently the Lorentz metric) 
are relativistic physical quantities. 

The position four-vector and the Lorentz distance are not sufficient for the 
development of Special Relativity in the same manner that the position vector and 
the time are not enough for the development of Newtonian Physics. Therefore we 
have to introduce more relativistic physical quantities and this is done by means 
of the Covariance Principle. In Newtonian Physics the position vector and the time 
defined the mathematical character of the Newtonian physical quantities via the 
Galileo Principle of Relativity and the second part of the Principle of Covariance. 
Similarly in Special Relativity the position four-vector and the spacetime distance 
define the nature of the relativistic physical quantities via the Lorentz Covariance 
Principle Part II: 


4.8.2 Lorentz Covariance Principle Part II 


The relativistic physical quantities are described with Lorentz tensors.
This means that the relativistic physical quantities are described with geometric
objects which:


(a) Are tensors of type (m, n) in Minkowski space, that is they have $4^m \times 4^n$
components (m, n = 0, 1, 2, 3, ...), where m is the number of contravariant
indices and n the number of covariant indices.

(b) If $T^{i'_1 \ldots i'_m}_{j'_1 \ldots j'_n}$ and $T^{i_1 \ldots i_m}_{j_1 \ldots j_n}$ are the
components of a Lorentz tensor T wrt the RIO Σ′ and Σ respectively, then:

$$T^{i'_1 \ldots i'_m}_{j'_1 \ldots j'_n} = L^{i'_1}{}_{i_1} \cdots L^{i'_m}{}_{i_m}\, L^{j_1}{}_{j'_1} \cdots L^{j_n}{}_{j'_n}\, T^{i_1 \ldots i_m}_{j_1 \ldots j_n} \tag{4.8}$$

where $L^{i'}{}_{i}$ is the Lorentz transformation which relates Σ and Σ′ and
$L^{i}{}_{i'}$ is its inverse.
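As an illustration of the transformation law (4.8), the two lowest-rank cases can be written out explicitly. This is only a sketch: it assumes the index convention $L^{i'}{}_{i}$ for the transformation from Σ to Σ′ and $L^{i}{}_{i'}$ for its inverse, and the symbols $A^i$ and $B_{ij}$ are placeholder names for a generic contravariant vector and a generic covariant tensor of type (0, 2); they are not taken from the text.

```latex
% Type (1,0): a contravariant four-vector, 4 components
A^{i'} = L^{i'}{}_{i}\, A^{i}

% Type (0,2): a covariant tensor, 4^2 = 16 components
B_{i'j'} = L^{i}{}_{i'}\, L^{j}{}_{j'}\, B_{ij}

% Applied to the Lorentz metric in LCFs, whose components are the same in both frames,
% the second law expresses the isometry condition (4.7) for the inverse transformation
\eta_{i'j'} = L^{i}{}_{i'}\, L^{j}{}_{j'}\, \eta_{ij}
```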


4.8.3 Rules for Constructing Lorentz Tensors 


The rules of Proposition 1.4.1 which were applied in order to define new Newtonian
tensors from other Newtonian tensors also apply to Lorentz tensors. That is we have
the following rules:


Rule 1

$$\frac{d\,[\text{Lorentz tensor of type }(r,s)]}{d\,[\text{Lorentz invariant}]} = [\text{Lorentz tensor of type }(r,s)]. \tag{4.9}$$

Rule 2

$$[\text{Lorentz invariant}] \cdot [\text{Lorentz tensor of type }(r,s)] = [\text{Lorentz tensor of type }(r,s)]. \tag{4.10}$$


In Newtonian Physics Rule 1 is used to define the kinematic quantities of
the theory (velocity, acceleration, etc.) and Rule 2 is used for the definition of
the dynamical quantities (momentum, force etc.). In a similar manner in Special
Relativity Rule 1 defines the four-velocity and the four-acceleration, and Rule 2 the
four-momentum, the four-force etc. These four-vectors we study in the chapters
to follow.
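The way the two rules generate new four-vectors from the position four-vector can be sketched as follows. The block assumes the proper time τ as the Lorentz invariant in Rule 1 and an invariant mass m in Rule 2; the symbols $u^i$, $a^i$, $p^i$ are the conventional names and are used here only for illustration, the precise definitions being left to the later chapters.

```latex
% Rule 1: derivative of a Lorentz tensor wrt a Lorentz invariant (here the proper time \tau)
u^{i} = \frac{dx^{i}}{d\tau}, \qquad a^{i} = \frac{du^{i}}{d\tau}

% Rule 2: product of a Lorentz invariant (here an invariant mass m) with a Lorentz tensor
p^{i} = m\, u^{i}
```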


4.8.4 Potential Relativistic Physical Quantities 


We come now to the following point: Does every Lorentz tensor represent a 
relativistic physical quantity? The answer to this question is the same as the 
one we gave in Newtonian Physics where we introduced the potential Newtonian 
physical quantities. These quantities were Euclidian tensors which possibly describe 
a Newtonian physical quantity. Based on that we consider that each Lorentz tensor 
is a potential relativistic physical quantity and in order to be a relativistic physical 
quantity must satisfy the following criteria: 


• In case it has a Newtonian limit for a characteristic observer (e.g. the proper
observer) then that limit must correspond to a Newtonian physical quantity

• In case it does not have a Newtonian limit (e.g. the speed of light) then it obtains
physical status only by means of a principle (e.g. as it is done with the speed of
light in vacuum).


In the chapters to follow we shall have the chance to see how these criteria are 
applied in practice. 


4.9 Universal Speeds and the Lorentz Transformation 


The purpose of the determination of the (proper) Lorentz transformation in
Sect. 1.6.1 as the solution of the isometry equation $L^t\eta L = \eta$ was to show that
the Lorentz transformation as such is of a purely mathematical nature. The Physics of
the Lorentz transformation comes from its kinematic interpretation.

In this section we shall derive the proper Lorentz transformation using the
physical hypothesis that there exist in nature universal speeds, not necessarily only
one. This approach is closer to the standard approach of the literature; however,
it has value because it produces at the same time the Galileo transformation
and incorporates the kinematic interpretation of the transformation. Furthermore
this derivation of the Lorentz transformation, although simple, is useful because it
follows an axiomatic approach which familiarizes the reader with this important
methodology of theoretical Physics.¹¹


¹¹ In the literature one can find many derivations of the proper Lorentz transformation
(mainly of the boosts) using a more or less axiomatic approach. Some of them are:


1. D. Sardelis “Unified Derivation of the Galileo and the Lorentz transformation” Eur. J. Phys.
(1982) 3, 96-99

2. A.R. Lee and T. Malotas “Lorentz Transformations from first Principles” Am. J. Phys. (1975)
43, 434-437 and Am. J. Phys. (1976) 44, 1000-1002

3. V. Berzi and V. Gorini “Reciprocity Principle and the Lorentz Transformation” J. Math.
Phys. (1968) 10, 1518-1524

4. C. Fahnline “A covariant four-dimensional expression for Lorentz transformations” Am. J.
Phys. (1982) 50, 818-821.


Consider two observers (not NIO or RIO but just two observers who in some
way produce coordinates for the physical phenomena in some four-dimensional
space) Σ, Σ′ with coordinate systems (ct, x, y, z) and (ct′, x′, y′, z′). We relate
these coordinates with a number of mathematical assumptions which must satisfy
the following criteria:


(a) Have a physical meaning or significance 
(b) Are the least possible 
(c) Lead to a unique result. 


Assumption I There are linear transformations among the coordinates (ct, x) and
(ct′, x′) which are parameterized by a real parameter V (whose physical significance
will be given later).


This assumption implies that we can write:

$$\begin{pmatrix} ct \\ x \end{pmatrix} =
\begin{pmatrix} \alpha_1 & \alpha_2 \\ \beta_1 & \beta_2 \end{pmatrix}
\begin{pmatrix} ct' \\ x' \end{pmatrix} +
\begin{pmatrix} \gamma \\ \delta \end{pmatrix} \tag{4.11}$$

where $\alpha_1, \alpha_2, \beta_1, \beta_2, \gamma, \delta$ are well defined (i.e. continuous
and infinitely differentiable) real functions of the real parameter V. We assume that
the parameter V is such that the following assumption is satisfied:


Assumption II


(a) If t = t′ = 0 then x = x′ = 0 (4.12)
(b) If x′ = 0 for all t′, then x = Vt (4.13)
(c) If x = 0 for all t, then x′ = -Vt′ (4.14)
(d) If V = 0 then x = x′ and t = t′. (4.15)


Let us examine the implications of Assumption II on geometry and kinematics.
Condition (a) means that the transformation (4.11) is synchronous, that is, when the
origins of Σ and Σ′ coincide then the times t and t′ are made to coincide. Condition
(b) means that the plane (y′, z′) which is defined with the equation x′ = t′ = 0 is
always parallel to the plane (y, z) which is defined at each time coordinate t with
the relation x = Vt. Similar remarks hold for condition (c) with the difference
that now the plane (y, z) is kept parallel to the plane (y′, z′). Finally, condition (d)
implies that for the value V = 0 the two coordinate systems coincide and Σ is not
differentiated from Σ′.

From (b) and (c) follows that the parameter V has dimensions $[LT^{-1}]$, therefore
its physical meaning is speed. We continue with the consequences of Assumption II
on the coefficients $\alpha_1, \alpha_2, \beta_1, \beta_2$. Equation (4.12) implies
$\gamma = \delta = 0$, hence the transformation is homogeneous. Equation (4.13) implies:

$$\beta_1 = \frac{V\alpha_1}{c}.$$

Similarly equation (4.14) gives:

$$\beta_1 = \frac{V\beta_2}{c}.$$

From these follows:

$$\alpha_1 = \beta_2.$$

Therefore, the transformation equation (4.11) becomes:

$$\begin{pmatrix} ct \\ x \end{pmatrix} =
\begin{pmatrix} \beta_2 & \alpha_2 \\ \frac{V}{c}\beta_2 & \beta_2 \end{pmatrix}
\begin{pmatrix} ct' \\ x' \end{pmatrix}. \tag{4.16}$$

In this expression we have two unknowns (the functions $\alpha_2$, $\beta_2$), thus we need
two more assumptions before we determine the transformation.

We introduce the quantities U = x/t and U′ = x′/t′ where (ct, x) and (ct′, x′)
are the coordinates of a moving particle in the frames Σ and Σ′ respectively. The
quantities U, U′ represent the x- and x′-components of the velocity of the particle
wrt Σ and Σ′ respectively. From the transformation equations (4.16) follows easily
($\beta_2 \neq 0$):

$$U = \frac{x}{t} = \frac{U' + V}{1 + \dfrac{\alpha_2}{c\,\beta_2}\,U'}. \tag{4.17}$$

Equation (4.17) is a general relation which relates the velocities U, U′ with the
parameter V. In order to determine the ratio $\alpha_2/\beta_2$ we consider one more
assumption:


Assumption III There are at least two universal speeds, one with value infinity and
the other with finite value c, which are defined by the requirements:


If U → +∞ then U′ → +∞ (V finite)
If U → c then U′ → c.


The first speed gives ($\beta_2 \neq 0$):

$$\alpha_2 = 0 \tag{4.18}$$

and the second:

$$\alpha_2 = \frac{V}{c}\,\beta_2. \tag{4.19}$$
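The two results (4.18) and (4.19) follow from the velocity relation (4.17) by a short computation, sketched here for completeness. The sketch assumes (4.17) in the form $U = (U' + V)/(1 + \alpha_2 U'/(c\beta_2))$, obtained by dividing the two rows of (4.16), and $\beta_2 \neq 0$.

```latex
% Universal speed at infinity: let U' -> infinity in (4.17) with V finite
\lim_{U' \to \infty} \frac{U' + V}{1 + \frac{\alpha_2}{c\beta_2} U'} = \frac{c\,\beta_2}{\alpha_2}
\quad (\alpha_2 \neq 0),
\qquad \text{so } U \to \infty \text{ requires } \alpha_2 = 0 .

% Finite universal speed: set U = U' = c in (4.17)
c = \frac{c + V}{1 + \frac{\alpha_2}{\beta_2}}
\;\Longrightarrow\;
1 + \frac{\alpha_2}{\beta_2} = 1 + \frac{V}{c}
\;\Longrightarrow\;
\alpha_2 = \frac{V}{c}\,\beta_2 .
```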
Assumption III is not an assumption without content (that is, there do exist
in nature such speeds) and this is what the experiments of Michelson-Morley
have shown. Furthermore in Newtonian Physics we accept that the interactions are
propagating with infinite speed (action at a distance). The value of the constant c
is the speed of light in vacuum. The above do not exclude further universal speeds,
however since they have not been found yet we shall ignore them.

There is still one parameter to determine, therefore we need one more assump- 
tion. This assumption has to do with the isometry property of the transformation and 
is as follows: 


Assumption IV The determinant of the transformation (4.16) equals +1. 


In order to quantify this assumption we compute the determinant D of the
transformation. We find:

$$D = \beta_2^2 - \frac{V}{c}\,\alpha_2\beta_2 = \beta_2\left(\beta_2 - \frac{V}{c}\,\alpha_2\right). \tag{4.20}$$

In the first case of infinite speed $\alpha_2 = 0$ and Assumption IV gives
$\beta_2 = \pm 1$. The value $\beta_2 = -1$ is rejected because it leads to t = -t′ for all
values of V and this contradicts (4.15). Hence $\beta_2 = 1$ and the transformation becomes:

$$t = t', \qquad x = x' + Vt'. \tag{4.21}$$

These equations are the Galileo transformation of time and space coordinates of
Newtonian Physics. We note that in this case equation (4.17) gives:

$$U = U' + V \tag{4.22}$$

which is the formula of composition of velocities of Newtonian Physics.

We come now to the case of finite speed in which $\alpha_2 \neq 0$. The assumption
D = 1 and (4.20) in combination with (4.19) imply:

$$\beta_2 = \pm\left(1 - \frac{V^2}{c^2}\right)^{-1/2} = \pm\gamma. \tag{4.23}$$

From equation (4.16) we find for V = 0, $\beta_2(0) = \pm 1$, from which as before we
take the positive sign only. In this case the transformation is:

$$x' = \gamma\,(x - Vt), \qquad t' = \gamma\left(t - \frac{Vx}{c^2}\right). \tag{4.24}$$

This is the boost along the common x, x′ axis with relative velocity V. Transformation
(4.24) is symmetric. This can be seen if it is written as follows:

$$x' = \gamma\,(x - \beta ct), \qquad ct' = \gamma\,(ct - \beta x) \tag{4.25}$$

where β = V/c. We infer that the physical interpretation of the parameter V is that
it corresponds to the relative speed of Σ, Σ′ in the case the x, x′ axes are common
and they move as in Fig. 4.8. This arrangement of motion we shall call the standard
configuration in order to economize space.

[Fig. 4.8 Standard configuration of relative motion of two LCF. The figure shows the
axes y, z of Σ and y′, z′ of Σ′.]

We examine condition (4.17) in order to determine the rule of composition of
velocities for the case of the finite speed. Replacing in (4.24) (or otherwise) we
compute:

$$U = \frac{U' + V}{1 + \dfrac{U'V}{c^2}}. \tag{4.26}$$

This relation is the transformation of the 3-velocities under the boost (4.25). It is
easy to show that if U = c then U′ = c, as required by the universality of the speed
c.¹² Also we note that the inverse transformation which expresses (ct, x) in terms
of (ct′, x′) is the same as (4.24) with the difference that V is replaced with -V.
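The boost along the common axis and the two composition rules for velocities can be tried out with a few lines of code. The following is a hedged sketch: the function names `gamma`, `boost` and `compose` and the convention `c=None` for the universal speed at infinity are choices made for this example only, and the boost is written in the symmetric form x′ = γ(x - βct), ct′ = γ(ct - βx).

```python
import math

def gamma(V, c):
    """Lorentz factor for relative speed V; requires |V| < c."""
    return 1.0 / math.sqrt(1.0 - (V / c) ** 2)

def boost(ct, x, V, c):
    """Map (ct, x) in Sigma to (ct', x') in Sigma' (standard configuration)."""
    b, g = V / c, gamma(V, c)
    return g * (ct - b * x), g * (x - b * ct)

def compose(U_prime, V, c=None):
    """Velocity U in Sigma of a particle that has velocity U' in Sigma'.

    c=None selects the universal speed at infinity (Galileo composition);
    a finite c gives the relativistic composition rule.
    """
    if c is None:
        return U_prime + V
    return (U_prime + V) / (1.0 + U_prime * V / c ** 2)
```

With c = 1, `compose(1.0, 0.5, c=1.0)` returns the universal speed again, and applying `boost` with V followed by `boost` with -V returns the original coordinates, which illustrates the remark on the inverse transformation.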

All the above assume a common spatial direction in the two LCF Σ, Σ′.
Along this direction the transformation of coordinates between Σ′ and Σ has the
characteristic that it is reversed if we take -V in place of V, that is, instead of
considering Σ′ to move wrt Σ we consider Σ to move wrt Σ′. This symmetry
of relative motion between Σ, Σ′ has been called reciprocity of motion. This
direction is obviously related to the relative velocity of Σ and Σ′, that is
$\mathbf{V}_{\Sigma\Sigma'}$ and $\mathbf{V}_{\Sigma'\Sigma}$. This leads us to the next assumption:


Assumption V

$$\mathbf{V}_{\Sigma'\Sigma} = -\mathbf{V}_{\Sigma\Sigma'}. \tag{4.27}$$


This assumption defines the common direction between Σ and Σ′. Its geometric
role is to “cut” the three dimensional space in two dimensional slices (y, z), (y′, z′)
(a foliation of the three dimensional space!). Assumption V gives to the real
parameter V the general physical significance as the relative velocity of Σ and Σ′.


¹² Universal is a scalar physical quantity which has the same value in all (including the
accelerated!) coordinate systems of the theory.


With Assumption V we have assumed that the planes (y, z) and (y′, z′) remain
parallel, however we have not excluded the possibility that the axes y, z rotate as
the plane (y, z) moves along the common axis x, x′. This requires one more assumption:


Assumption VI Directions perpendicular to the characteristic common direction
of Σ and Σ′ do not rotate.


This implies that for the special type of motion we have considered:

$$y' = y, \qquad z' = z. \tag{4.28}$$

With equations (4.28) we have completed the derivation of the boost (and the Galileo
transformation!) along the common direction x, x′.

Using Assumptions V and VI it is possible to produce the general Lorentz
transformation for which no axis coincides with the characteristic common
direction, the relative velocity is arbitrary and the axes of the coordinate frames
are parallel. However this has already been done in Chap. 1.


Chapter 5
The Physics of the Position Four-Vector


5.1 Introduction 


The fundamental vector for all theories of motion is the position vector. In each 
such theory this vector is defined by means of a definite procedure, which reflects 
the way the theory incorporates the concepts of space and time. The position 
vector is a natural concept in Newtonian Physics due to the sensory relation of 
this theory with the concepts of space and time. However the case is different with 
Special Relativity. The measurement of the position four-vector with light signals 
(chronometry) leads to a relation between the two concepts and necessitates their 
reconsideration. In the present chapter we consider the “relativistic” view of space 
and time and their relation (via a definition!) with the corresponding concepts of 
Newtonian Physics. This correspondence is necessary because we conceive the 
world via our Newtonian sensors. The results we find comprise the Physics of the 
position four-vector and specify the relativistic kinematics at the level of everyday 
experience in the laboratory. 


5.2 The Concepts of Space and Time in Special Relativity 


In Newtonian Physics the measurement of the position vector is achieved by means 
of two procedures (=reading of scales) and three different and absolute (that 
is, the same for all Newtonian Observers) physical systems: The mass gun, the 
(absolute) unit rigid rod and the (absolute) cosmic clock. In Special Relativity 
the measurement of the position four-vector is done with one procedure, the 
chronometry as developed in Sect. 4.6.2, and two physical systems, the photongun 
and the proper clock, none of which is absolute (that is, common for all relativistic 
observers). 


© Springer Nature Switzerland AG 2019 121 
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-27347-7_5 




Unfortunately chronometry does not suffice for the study of physical problems 
in practice for the following reasons: 


a. In the laboratory we still use the “absolute” rigid rod of Newtonian Physics 
to measure space distances. Furthermore our clocks are based (as a rule) on 
Newtonian physical systems, which we understand and use as being absolute 

b. The measurement of spatial and temporal differences involves the coordinates of 
two events whereas chronometry is concerned with the position vector of a single 
event. 


Therefore standard practice imposes the question/demand: 


How can one measure Newtonian spatial and Newtonian temporal differences in Special 
Relativity? 


The answer to this question/demand will be given in the pages to follow. More
specifically we shall define a new procedure, which we call the chronometry of two
events, which will associate the relative spacetime distance of two events with a
Newtonian spatial and temporal distance. The misunderstanding of this procedure
has led to a class of problems in Special Relativity known as paradoxes. The
“paradoxicalness” of these problems is always due to the mistaken interpretation of
the relativistic measurement of either the spatial distance or the temporal difference.


5.3. Measurement of Spatial and Temporal Distance 
in Special Relativity 


Consider two events P, Q in spacetime with position four-vector¹
$x^i_P = (l, \mathbf{r})^t_{\Sigma} = (l', \mathbf{r}')^t_{\Sigma'}$ and
$x^i_Q = (l + dl, \mathbf{r} + d\mathbf{r})^t_{\Sigma} = (l' + dl', \mathbf{r}' + d\mathbf{r}')^t_{\Sigma'}$
in the RIO Σ and Σ′ respectively. The four-vector $PQ^i$ has components:

$$(PQ)^i = (dl, d\mathbf{r})^t_{\Sigma} = (dl', d\mathbf{r}')^t_{\Sigma'}. \tag{5.1}$$

This information we collect in Table 5.1.


Table 5.1 Table of coordinates of two events

        Σ                            Σ′
P :     (l, r)^t_Σ                   (l′, r′)^t_Σ′
Q :     (l + dl, r + dr)^t_Σ         (l′ + dl′, r′ + dr′)^t_Σ′
PQ :    (dl, dr)^t_Σ                 (dl′, dr′)^t_Σ′


¹ With $x^i_P = (l, \mathbf{r})^t_{\Sigma}$ we mean
$x^i_P = \begin{pmatrix} l \\ \mathbf{r} \end{pmatrix}_{\Sigma}$. The sole reason for this
writing is to economize space. Recall that according to our convention a contravariant
vector has upper index and is represented with a column matrix and a covariant vector
has lower index and it is represented with a row matrix.


Table 5.2 The relativistically allowable cases of measurement of spatial distance and
temporal difference

       dl     dr     dl′    dr′    Remarks
(a)    All zero (1 case)           Trivial
(b)    One ≠ 0 (4 cases)           Impossible due to (5.2)
(c)    Two ≠ 0 (6 cases)
       0      ≠0     ≠0     0      Impossible due to (5.2)
       ≠0     0      0      ≠0     Impossible due to (5.2)
       ≠0     ≠0     0      0      Impossible due to (5.3)
       0      0      ≠0     ≠0     Impossible due to (5.3)
       0      ≠0     0      ≠0     Impossible due to (5.3), (5.4)
       ≠0     0      ≠0     0      Impossible due to (5.3), (5.4)
(d)    Three ≠ 0 (4 cases)
       ≠0     0      ≠0     ≠0     Case 1
       0      ≠0     ≠0     ≠0     Case 2
       ≠0     ≠0     0      ≠0     Case 3
       ≠0     ≠0     ≠0     0      Case 4
(e)    All ≠ 0 (1 case)            The general case


In order to find all allowable possibilities we consider the relations which connect
the quantities dl, dr and dl′, dr′. These relations are the following two:


a. The invariance of the Lorentz length of the four-vector $(PQ)^i$:

$$-dl^2 + d\mathbf{r}^2 = -dl'^2 + d\mathbf{r}'^2. \tag{5.2}$$

b. The quantities dl, dr and dl′, dr′ are related with a (proper) Lorentz transformation
according to relations² (see (1.51) and (1.52)):

$$d\mathbf{r} = d\mathbf{r}' + \left(\frac{\gamma}{\gamma + 1}\,\boldsymbol{\beta}\cdot d\mathbf{r}' + dl'\right)\gamma\,\boldsymbol{\beta} \tag{5.3}$$

$$dl = \gamma\left(dl' + \boldsymbol{\beta}\cdot d\mathbf{r}'\right). \tag{5.4}$$

From relations (5.2), (5.3), and (5.4) we construct Table 5.2, which shows all 
possible cases and indicates which are allowed and which are excluded. 
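The interplay of the three relations can be explored numerically. The sketch below is an illustration only: it assumes that (5.3) has the form dr = dr′ + (γ/(γ+1) β·dr′ + dl′) γβ and that (5.4) reads dl = γ(dl′ + β·dr′), with β the relative 3-velocity in units of c; the function names are invented for this example.

```python
import math

def dot(a, b):
    """Euclidean inner product of two 3-tuples."""
    return sum(x * y for x, y in zip(a, b))

def to_sigma(dl_p, dr_p, beta):
    """Map (dl', dr') measured in Sigma' to (dl, dr) in Sigma.

    beta is the relative 3-velocity in units of c, with |beta| < 1.
    """
    g = 1.0 / math.sqrt(1.0 - dot(beta, beta))
    bdr = dot(beta, dr_p)
    dl = g * (dl_p + bdr)                         # relation (5.4)
    k = (g / (g + 1.0) * bdr + dl_p) * g          # coefficient of beta in (5.3)
    dr = tuple(r + k * b for r, b in zip(dr_p, beta))
    return dl, dr

def lorentz_length(dl, dr):
    """The quantity -dl^2 + dr^2 appearing on both sides of (5.2)."""
    return -dl * dl + dot(dr, dr)
```

For example, with dr′ = 0 and dl′ = 1 the call `to_sigma(1.0, (0.0, 0.0, 0.0), (0.6, 0.0, 0.0))` gives dl ≠ 0 and dr ≠ 0, which is one of the three-nonzero cases of Table 5.2, and `lorentz_length` takes the same value in both frames.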

We note that, besides the most general case (e), it is possible to relate the
quantities dl, dr, dl′, dr′ in a relativistically consistent manner only in four cases,
each case being characterized by the vanishing of one of the quantities dl, dr, dl′, dr′.
This result demands that we consider each of these cases as defining the relativistic
measurement of the spatial and the temporal distance in the RIO Σ and Σ′. In order
to find the exact role for each case we consider the Newtonian measurement of
spatial and temporal distance and note that:


• The Newtonian measurement of spatial distance involves two points in space (i.e.
dr ≠ 0 or dr′ ≠ 0) and takes place at one time moment (i.e. dl = 0 or dl′ = 0).

• The Newtonian measurement of temporal difference involves one point in space
(i.e. dr = 0 or dr′ = 0) and two time moments (i.e. dl ≠ 0 or dl′ ≠ 0).


² Use footnote 24 on page 21 to write $\frac{\gamma - 1}{\beta^2} = \frac{\gamma^2}{\gamma + 1}$.


Having as a guide the Newtonian method of measuring spatial distance and 
temporal difference we give the following definition for the measurement of 
the relativistic spatial distance and the measurement of the relativistic temporal 
difference. 


Definition 5.3.1 Condition $d\mathbf{r} = 0$ (respectively $d\mathbf{r}' = 0$) defines the relativistic
measurement of temporal difference at one spatial point by the RIO $\Sigma$ (respectively
$\Sigma'$). Condition $dl = 0$ (respectively $dl' = 0$) defines the relativistic measurement
of spatial distance at one time moment by the RIO $\Sigma$ (respectively $\Sigma'$).


Definition 5.3.1 suffices in order: 


• To define a unique and consistent procedure for the measurement of the
relativistic spatial distance and the relativistic temporal difference in Special
Relativity and

• To define a well-defined correspondence between the relativistic measurement of
the spatial distance and temporal difference and the corresponding Newtonian
quantities.


We shall need the following result. 


Example 5.3.1 Show that if two events coincide in a RIO $\Sigma$ then they coincide in
all RIO.
Solution

Consider two events A, B say, which coincide in the RIO $\Sigma$. Then in $\Sigma$ the four-vector
$(AB^i)_\Sigma = 0$. Any other RIO is related to $\Sigma$ via a Lorentz transformation.
The Lorentz transformation is homogeneous and therefore preserves the zero four-vector.
This means that in all RIO $AB^i = 0$, that is, the events A, B coincide.


5.4 Relativistic Definition of Spatial and Temporal Distances 


Consider the events P, Q in spacetime and the four-vector $PQ^i$ they define.
There are three possibilities: $PQ^i$ timelike, spacelike or null. In each case the
measurement of the vector $PQ^i$ has a different meaning. In accordance with what
has been said in Sect. 5.3 we give the following definition:


Definition 5.4.1 If the four-vector $PQ^i$ is timelike then the measurement of the
distance of the events P, Q concerns temporal difference. If the four-vector $PQ^i$
is spacelike the measurement of the distance of the events P, Q concerns spatial
distance. Finally if $PQ^i$ is null then the measurement of the distance of the events
P, Q is not defined.


Definition 5.4.1 is compatible with Definition 5.3.1 because the vanishing of
either $dl$ or $dl'$ implies that the position four-vector is spacelike whereas the
vanishing of either $d\mathbf{r}$ or $d\mathbf{r}'$ means that the position four-vector is timelike. We
conclude that in order to study the four allowable cases of Table 5.2 it is enough to
study the cases in which the position four-vector is timelike and spacelike.
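The trichotomy of Definition 5.4.1 can be sketched as a small function. This is an illustration only, assuming the components $(dl, d\mathbf{r})$ are given in one LCF with the signature of (5.2); the name `classify` and the tolerance `tol` are hypothetical choices, not part of the text.

```python
def classify(dl, dr, tol=1e-12):
    """Return the character of the four-vector with components (dl, dr)."""
    s2 = -dl * dl + sum(x * x for x in dr)   # Lorentz length squared, as in (5.2)
    if s2 < -tol:
        return "timelike"     # measurement concerns temporal difference
    if s2 > tol:
        return "spacelike"    # measurement concerns spatial distance
    return "null"             # distance of the events is not defined

# dr = 0 gives a timelike vector, dl = 0 a spacelike one (Definition 5.3.1)
print(classify(1.0, (0.0, 0.0, 0.0)))
print(classify(0.0, (1.0, 0.0, 0.0)))
print(classify(1.0, (1.0, 0.0, 0.0)))
```

Because the Lorentz length is invariant, the answer does not depend on the RIO in which the components are taken.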


5.5 Timelike Position Four-Vector: Measurement 
of Temporal Distance 


Consider two events P, Q which define a timelike position four-vector $PQ^i$. Let
$\Sigma^+_{PQ}$ be the proper observer of the timelike four-vector $PQ^i$. Let $\Sigma$ be
another RIO whose world line makes (in spacetime) an angle $\phi$ with the
world line of $\Sigma^+_{PQ}$. We are looking for a definition/procedure which will determine
the time difference of the events P, Q for $\Sigma$.

In order to do that we recall that the measurement of time difference by a 
Newtonian observer requires that the events occur at the same spatial point. At this 
point one places a clock and reads the indications of the clock when the events 
occur, the difference of these indications being the required temporal difference of 
the events for the observer (and at the same time, due to the absolute time, for all other
Newtonian observers!). We transfer these ideas to Special Relativity as follows.

We consider the two spatial hyperplanes $\Sigma^+_P$ and $\Sigma^+_Q$ of $\Sigma^+_{PQ}$ at the points P, Q.
These planes contain all spatial events of spacetime which are simultaneous for $\Sigma^+_{PQ}$
with the events P, Q respectively. The world line of $\Sigma$ intersects the hyperplanes at
the points $P_1$, $Q_1$ respectively. We define the temporal difference of the events P, Q
for the RIO $\Sigma$ to be the (Lorentz) distance of the events $P_1$, $Q_1$ (see Fig. 5.1).


Fig. 5.1 Relativistic measurement of temporal difference of two events


Let $(PQ)_{\Sigma^+_{PQ}} = \tau_0$ be the distance (temporal difference) of the events P, Q as
measured by the proper observer $\Sigma^+_{PQ}$ with two readings of its clock (= Newtonian
way of measuring time difference) and let $(P_1Q_1)_\Sigma = \tau$ be the corresponding
distance (temporal difference) measured relativistically with the procedure of
chronometry by $\Sigma$. The two quantities $\tau_0$ and $\tau$ are related with the Lorentz
transformation which connects the observers $\Sigma^+_{PQ}$ and $\Sigma$. The components of the
four-vector $PQ^i$ in the LCF $\Sigma^+_{PQ}$ and $\Sigma$ are $PQ^i = (c\tau_0, \mathbf{0})_{\Sigma^+_{PQ}}$ and
$PQ^i = (c\tau, \mathbf{PQ})_\Sigma$ respectively. The Lorentz transformation for the zeroth component gives:

    $\tau = \gamma\,\tau_0$    (5.5)

where $\gamma = (1-\beta^2)^{-1/2}$ and for the space component gives the trivial relation
$\mathbf{v} = \mathbf{PQ}/\tau$. We remark that these relations can be obtained from the hyperbolic
geometry of an orthogonal triangle if we make the following conventions³ (see
Fig. 5.2):


1. The side AB = temporal difference of the events P, Q as measured by $\Sigma^+_{PQ}$

2. The side BC = space distance of the events P, Q as measured by $\Sigma$

3. The angle $\phi$ is related to the rapidity (i.e. relative speed) of $\Sigma^+_{PQ}$, $\Sigma$ with the
relation:

    $\cosh\phi = \gamma$.


It is easy to see that with these conventions relation (5.5) is found from the
(hyperbolic) orthogonal triangle ABC in the form $\tau = \tau_0\cosh\phi = \gamma\,\tau_0$.
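The hyperbolic-triangle reading of (5.5) can be illustrated numerically. The sketch below assumes the standard relation $\tanh\phi = \beta$ between the rapidity and the relative speed (which is equivalent to $\cosh\phi = \gamma$); the function names are hypothetical.

```python
import math

def gamma_factor(beta):
    """gamma = (1 - beta^2)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def rapidity(beta):
    """Angle phi of the hyperbolic triangle, assuming tanh(phi) = beta."""
    return math.atanh(beta)

def dilated_time(tau0, beta):
    """Relation (5.5) written geometrically: tau = tau0 * cosh(phi)."""
    return tau0 * math.cosh(rapidity(beta))

beta = 0.6                      # illustrative value
phi = rapidity(beta)
print(math.cosh(phi), gamma_factor(beta))   # both sides of cosh(phi) = gamma
print(dilated_time(2.0, beta))              # tau for tau0 = 2 (arbitrary units)
```

The same angle also gives $\sinh\phi = \beta\gamma$, which is the relation used for the spatial side of the triangle.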


Fig. 5.2 Hyperbolic triangle for the measurement of temporal difference


3 At this point it is important to distinguish between spacetime diagrams and geometric diagrams. 
The first are related to observational procedures whereas the latter are schematic devices intended 
to be used only for computational purposes of temporal and spatial distances in different frames of 
reference. The lengths of the sides drawn in the geometric type of diagrams are not directly related 
to the actual magnitudes of the quantities represented by those sides. 




Because $\gamma > 1$ we have $\tau > \tau_0$. Therefore the result of the measurement of the
temporal difference of the events P, Q for two RIO is different and the minimum
value occurs for the proper observer $\Sigma^+_{PQ}$. In words equation (5.5) can be stated as
follows⁴:

    (Relativistically measured temporal difference of the events P, Q in the LCF of the RIO $\Sigma$)
        $= \gamma \times$ (Newtonian measured temporal difference of the events P, Q in the proper frame $\Sigma^+_{PQ}$).    (5.6)


We note that the quantity $c\tau_0 = \sqrt{-(PQ)^2}$ has the following properties:

1. It is invariant, therefore the value $\tau_0$ is the same for all RIO

2. The value $\tau_0$ is determined experimentally with a procedure similar to the
Newtonian one (i.e. reading the clock indication), however only by the proper
observer $\Sigma^+_{PQ}$.


The inequality $\tau > \tau_0$ has been called time dilation. The name is not the most
appropriate (and has created a lot of confusion) because the quantities $\tau$ and $\tau_0$ refer
to different measurements of different observers with different methods and concern
temporal (i.e. coordinate) distances and not time differences.

We arrive now at the following question: 


Is it possible for a Newtonian observer (as we are!) to measure the relativistic temporal 
difference of two events in spacetime? 


The answer is ‘yes’ and it is given in the following example. 


Example 5.5.1 A clock moves along the x-axis of a RIO $\Sigma$ with constant velocity
$\mathbf{u} = u\mathbf{i}$. In order for the observer in $\Sigma$ to measure the time difference of the
“moving” clock, it decides to apply the following procedure:

The observer considers two positions P, Q along the x-axis which are at a distance
$d_{(PQ)}$ apart and with its clock measures the time interval $\Delta t$ required for the moving
clock to pass through the point P (event A) and through the point Q (event B). Then
the observer defines the time interval $\Delta t^+$ measured by the moving clock between the
two events A, B to be $\Delta t^+ = \Delta t/\gamma$. Is this method of measuring relativistic time
difference with Newtonian observations compatible with the chronometric method
proposed above?
Answer⁵ using (a) The algebraic method (b) The geometric method.
Solution

Let $\Sigma^+$ be the proper frame of the moving clock, which has (unknown) speed
$u$ along the x-axis of the LCF $\Sigma$.


⁴ Relativistic measurement means: with the use of light signals (radar method). Newtonian
measurement means: with direct reading of the observer’s (not the absolute!) clock.

⁵ The algebraic and the geometric methods of solving problems in Special Relativity involving
spatial and temporal differences are explained below in Sect. 5.11.3.




(a) Algebraic solution 
We consider the events: 


A: The moving clock passes through point P. 
B: The moving clock passes through point Q. 


Because the events A, B are points of the world line of a clock they define a timelike
four-vector $AB^i$. We write the coordinates of the events A, B in the frames of $\Sigma$
and $\Sigma^+$ in the following Table:

              $\Sigma$                                    $\Sigma^+$
    A:        $(ct_A,\ x_A)$                              $(ct^+_A,\ 0)$
    B:        $(c(t_A + \Delta t),\ x_A + d_{AB})$        $(c(t^+_A + \Delta t^+),\ 0)$
    $AB^i$:   $(c\Delta t,\ d_{AB})_\Sigma$               $(c\Delta t^+,\ 0)_{\Sigma^+}$


The boost relating $\Sigma^+$, $\Sigma$ gives (with $d_{(AB)} = d_{(PQ)}$ the distance of the two positions):

    $d_{(AB)} = \gamma u\,\Delta t^+$
    $\Delta t = \gamma\,\Delta t^+$.

Eliminating $\Delta t^+$ we find:

    $u = \frac{d_{(AB)}}{\Delta t}$    (5.7)

which is compatible with the Newtonian measurement of the speed of the moving
clock in $\Sigma$. The second relation gives:

    $\Delta t^+ = \frac{\Delta t}{\gamma}$

where $\gamma$ is computed from the speed $u$ which is measured with Newtonian methods
(equation (5.7)). This relation is similar to equation (5.5) and leads us to the
conclusion that with the proposed method observer $\Sigma$ is able with Newtonian means
and methods to determine relativistic temporal differences of a moving clock.
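The procedure of the algebraic solution can be sketched as follows. The sketch assumes SI units and that the observer supplies the measured distance and time interval; the function name `moving_clock_interval` and the numerical inputs are illustrative assumptions, not data from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def moving_clock_interval(d_ab, dt, c=C):
    """From the Newtonian data (d_ab, dt) of Sigma return (u, gamma, dt_plus)."""
    u = d_ab / dt                                   # relation (5.7)
    gamma = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    dt_plus = dt / gamma                            # interval shown by the moving clock
    return u, gamma, dt_plus

# Illustrative input: the clock covers 0.8 light-seconds in 1 s of Sigma time
u, gamma, dt_plus = moving_clock_interval(d_ab=0.8 * C, dt=1.0)
print(u / C, gamma, dt_plus)

# Consistency with the invariance of the Lorentz length of AB^i
lhs = -(C * 1.0) ** 2 + (0.8 * C) ** 2              # components in Sigma
rhs = -(C * dt_plus) ** 2                           # components in Sigma^+
print(lhs, rhs)
```

Only quantities measurable in $\Sigma$ enter the computation, which is the point of the example.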


(b) Geometric solution 


We consider Fig. 5.3: 
Explanation of the Figure 


Fig. 5.3 Newtonian method of measurement of temporal differences

Events

A, B: Events of “moving” clock.

$\Sigma$: World line of observer $\Sigma$.

$\Sigma^+$: World line of the “moving” clock.

$\phi$: Rapidity of the “moving” clock wrt observer $\Sigma$.

Data

$(AB_1)$: Temporal interval of the events A, B measured relativistically by observer
$\Sigma$, $(AB_1) = c\Delta t$.

$(AB)$: Temporal interval of the events A, B measured by Newtonian methods by
the proper observer of the moving clock, $(AB) = c\Delta t^+$.

$(B_1B)$: Space distance of the events A, B as measured with Newtonian methods
by $\Sigma$.

Requested
Relation between $(AB)$, $(AB_1)$.
From the triangle $AB_1B$ we have:

    $(AB_1) = (AB)\cosh\phi = \gamma\,(AB)$

from which follows:

    $\Delta t = \gamma\,\Delta t^+$

the same as before.
Again we note that:

    $d_{AB} = (BB_1) = (AB)\sinh\phi = (AB)\,\beta\gamma = u\,\Delta t$

as expected from the Newtonian measurement.

Fig. 5.4 shows the measurement of the temporal difference AB of a clock
from two RIO $\Sigma_1$ and $\Sigma_2$ with speeds $v_1 < v_2$ respectively wrt the clock (i.e. $\Sigma^+$).
We note that $\tau_1 < \tau_2$, that is, the faster observer measures a larger time difference
and as $v_2 \to c$, $\tau_2 \to \infty$. It is important to note that the normals are on the world
lines of the observers $\Sigma_1$, $\Sigma_2$ and not on the world line of the “moving” clock.

Fig. 5.4 Change of value of measured temporal difference for different speeds


5.6 Spacelike Position Four-Vector: Measurement of Spatial
Distance


Consider two events P, Q which define a spacelike four-vector $PQ^i$ and let $\Sigma^-$
be a characteristic frame of $PQ^i$. In $\Sigma^-$ the events P, Q occur at the same time
and the four-vector $PQ^i$ has components $(0, \mathbf{PQ}_{\Sigma^-})$ where $\mathbf{PQ}_{\Sigma^-}$ is the vector
giving the relative spatial position of the events P, Q in $\Sigma^-$. We consider a RIO
$\Sigma$, which moves wrt $\Sigma^-$ with parallel axes and relative speed $\beta c$. In order to
“measure” the relative spatial position of the events P, Q in $\Sigma$ we must associate
two corresponding events which are simultaneous in $\Sigma$. To achieve that we consider
the hyperplane which is normal to the world line of the observer $\Sigma$ at the point Q.
This plane contains all events which are simultaneous with Q. The world line of $\Sigma^-$
which passes through the point P intersects the hyperplane at a point $P_1$, hence this
is the simultaneous event of Q for $\Sigma$. We define the spatial distance $P_1Q_\Sigma$ of the
events P, Q for observer $\Sigma$ to be the spacetime distance of the events Q, $P_1$ (see
Fig. 5.5).

To quantify the above we consider the (hyperbolic) orthogonal triangle $QP_1P$
and take:

    $P_1Q_\Sigma = \frac{PQ_{\Sigma^-}}{\cosh\phi} = \frac{|\mathbf{PQ}_{\Sigma^-}|}{\gamma}$.    (5.8)


Fig. 5.5 Relativistic measurement of space distance

In order to understand the physical meaning of (5.8) we note that $PQ_{\Sigma^-}$ is the
spatial distance of the events P, Q measured in the Newtonian way in $\Sigma^-$ and $P_1Q_\Sigma$
is (by definition!) the relativistic measurement of the same quantity in $\Sigma$. Then
equation (5.8) can be understood as follows:

    (Relativistic measurement of the spatial distance of the events P, Q by the RIO $\Sigma$)
        $= \frac{1}{\gamma} \times$ (Newtonian measurement of the spatial distance of the events P, Q by the RIO $\Sigma^-$).    (5.9)

Because $\gamma > 1$ we have $PQ_{\Sigma^-} > P_1Q_\Sigma$, therefore $PQ_{\Sigma^-}$ is the maximal value of
the spatial distance of the events P, Q among all RIO. This relation has been
named length contraction where length means Euclidean length. The term “length”
creates confusion because the concept “length” is absolute in Newtonian Physics
and Euclidean Geometry, therefore it appears to be absurd that the spatial distance
of two events is different for relatively moving observers. However this is not so,
because equation (5.8) involves different quantities. In the rhs $PQ_{\Sigma^-}$ is Newtonian
length measured by the Newtonian method (of superposition) and by observer $\Sigma^-$
whereas in the lhs the quantity $P_1Q_\Sigma$ is the spatial (Euclidean) distance measured
chronometrically by the observer $\Sigma$. The difference between these two lengths is
due to the different method of measurement.
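The construction of the event $P_1$ and relation (5.8) can be reproduced in a two-dimensional sketch. The assumptions are: coordinates $(l, x)$ with $l = ct$ taken in the characteristic frame $\Sigma^-$, motion of $\Sigma$ along the x-axis with speed $\beta$, and hypothetical function names.

```python
import math

def simultaneous_event(q, x_p, beta):
    """Event P1 on the world line x = x_p of Sigma^- that Sigma considers
    simultaneous with Q. Events simultaneous for Sigma satisfy
    l - beta * x = const in the coordinates (l, x) of Sigma^-."""
    l_q, x_q = q
    return (l_q - beta * (x_q - x_p), x_p)

def spacelike_distance(e1, e2):
    """Lorentz distance of two spacelike separated events."""
    dl, dx = e2[0] - e1[0], e2[1] - e1[1]
    return math.sqrt(-dl * dl + dx * dx)

# P and Q simultaneous in Sigma^-, a distance D apart (illustrative values)
D, beta = 10.0, 0.6
P, Q = (0.0, 0.0), (0.0, D)
P1 = simultaneous_event(Q, x_p=P[1], beta=beta)
gamma = 1.0 / math.sqrt(1.0 - beta * beta)
print(P1, spacelike_distance(P1, Q), D / gamma)
```

The distance of $P_1$, Q should coincide with $D/\gamma$, in agreement with (5.8).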
We arrive again at the question: 


Is it possible for a Newtonian observer (as we are!) to measure the relativistic spatial 
difference of two events in spacetime? 


The answer is affirmative and it is given in the following example. 


Example 5.6.1 Consider a rod which slides along the x-axis of the LCF $\Sigma$ with
(known) constant velocity $\mathbf{u} = u\mathbf{i}$. In order for the observer $\Sigma$ to measure
experimentally the length of the rod, it applies the following procedure. First, it
measures the time period $\Delta t$ required for the two ends of the rod to pass through
a fixed point along the x-axis. Subsequently, it multiplies with the speed of the
rod in $\Sigma$ and calculates the length $L$ of the rod in $\Sigma$. Then $\Sigma$ defines the “length”
$L^-$ of the rod in a characteristic RIO $\Sigma^-$ to equal $\gamma L$. Is the above procedure of
measurement of spatial distance of events compatible (as far as the numeric results
are concerned) with the relativistic one? Answer using (a) The algebraic method (b) The
geometric method.


(a) Algebraic solution 


Let A, B be the end points of the rod and let $x$ be the coordinate of the
observation point along the x-axis of the LCF of $\Sigma$. We consider the events P, Q
to be the passage of the respective ends of the rod from the observation point. The
coordinates of the events P, Q in the LCF $\Sigma$ and $\Sigma^-$ are shown in the following
Table:

              $\Sigma$                          $\Sigma^-$
    P:        $(ct_A,\ x)$                      $(ct^-_A,\ 0)$
    Q:        $(c(t_A + \Delta t),\ x)$         $(c(t^-_A + \Delta t^-),\ L^-)$
    $PQ^i$:   $(c\Delta t,\ 0)_\Sigma$          $(c\Delta t^-,\ L^-)_{\Sigma^-}$


where $L^-$ is the length of the rod in $\Sigma^-$. From the boost relating $\Sigma$ and $\Sigma^-$ we
obtain:

    $L^- = \gamma u\,\Delta t$.    (5.10)

All quantities in the rhs of (5.10) are measurable by the RIO $\Sigma$ therefore the
length $L^-$ can be computed in $\Sigma$ only from the Newtonian measurement of the time
difference $\Delta t$. Equation (5.10) can be written differently. The Newtonian distance
$L$ of the events P, Q in $\Sigma$ is given by the relation:

    $L = u\,\Delta t$.

Replacing $u\,\Delta t$ in (5.10) we find:

    $L = \frac{L^-}{\gamma}$    (5.11)

which is compatible with (5.8). We conclude that the above method of estimating
the spatial distance of events in $\Sigma$ using Newtonian methods is compatible with the
relativistic one.
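A short numerical sketch of the algebraic solution follows. It assumes SI units, works with magnitudes only, and uses the additional relation $\Delta t^- = \gamma\,\Delta t$ (the zeroth component of the same boost) to check the invariance of the Lorentz length of $PQ^i$; the function name and inputs are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def rod_from_passage_time(u, dt, c=C):
    """Return (L, L_minus, dt_minus) from the speed u of the rod in Sigma and
    the time dt its two ends need to pass a fixed point of Sigma."""
    gamma = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    L = u * dt                 # Newtonian distance in Sigma
    L_minus = gamma * L        # relation (5.10): length in the rest frame of the rod
    dt_minus = gamma * dt      # time between P and Q in Sigma^- (boost, zeroth component)
    return L, L_minus, dt_minus

# Illustrative input: rod moving at 0.6c, passage time 10 ns
dt = 1e-8
L, L_minus, dt_minus = rod_from_passage_time(0.6 * C, dt)
print(L, L_minus)

# The Lorentz length of PQ^i is the same in both frames
print(-(C * dt) ** 2, -(C * dt_minus) ** 2 + L_minus ** 2)
```

Note that the events P, Q occur at the same point of $\Sigma$, so the four-vector $PQ^i$ itself is timelike; the spatial length of the rod enters through its component in $\Sigma^-$.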


(b) Geometric solution 


We consider the spacetime diagram of Fig. 5.6: 
Explanation of the Figure 


Fig. 5.6 Measurement of the length of a rod


Events

$A_1$, $B_1$: Observation of the ends A, B of the rod in $\Sigma$

$\Sigma$: World line of observer

A, B: World lines of the end points of the rod

$\Sigma_{t_A}$, $\Sigma_{t_B}$: Proper spaces of $\Sigma$ at the proper moments $ct_A$, $ct_B$.

$\phi$: rapidity of the rod in $\Sigma$.

Data

$(B_1B_2)$: relativistically measured length of the rod in $\Sigma$, $(B_1B_2) = L$.

$(B_2B_3)$: Newtonian measurement of the length of the rod in $\Sigma^-$, $(B_2B_3) = L^-$.

Requested
Relation between $(B_1B_2)$ and $(B_2B_3)$.
From the (hyperbolic) orthogonal triangle $B_1B_2B_3$ we have:

    $(B_1B_2) = \frac{(B_2B_3)}{\cosh\phi}$

from which follows $L = \frac{L^-}{\gamma}$ as before.⁶


Figure 5.7 shows the measurement of the spatial distance of two events P, Q by
two RIO $\Sigma_1$ and $\Sigma_2$ with speeds $v_1 < v_2$ respectively wrt the characteristic observer
$\Sigma^-$ of the rod. We note that $L_2 < L_1$, that is the faster observer measures a smaller
spatial distance and as $v_2 \to c$ the length $L_2 \to 0$. Furthermore we note that the
normals are on the world lines of the observers $\Sigma_1$, $\Sigma_2$ and not on the world line of
the characteristic observer $\Sigma^-$.


⁶ In Fig. 5.6, the length $L$ appears larger than $L^-$. This is due to the fact that we picture a hyperbolic
triangle in a Euclidean plane. However, as we remarked previously, diagrams like this are schematic
devices which are used only for computational purposes.


Fig. 5.7 Variation of the spatial distance of two events with velocity


5.7 The General Case


We have not yet covered the general case 5 of Table 5.2, in which none of the four
components $dl$, $dl'$, $d\mathbf{r}$, $d\mathbf{r}'$ of the position four-vector in the LCF of the RIO $\Sigma$
and $\Sigma'$ vanishes. We do that in the following definition, which incorporates
the considerations of Sects. 5.5, 5.6 and defines the correspondence between the
Newtonian and the relativistic spatial and temporal distances.


Definition 5.7.1 


(a) Let P, Q be events which define a timelike position four-vector. The value
of the Lorentz distance of the events P, Q (which is a relativistic invariant
therefore has the same value for all RIO!) as measured in the proper frame $\Sigma^+_{PQ}$
of the events P, Q coincides (numerically!) with the Newtonian time difference
(that is with the time difference measured by the cosmic clock) of the events
as measured by the Newtonian Inertial Observer who coincides with the RIO
$\Sigma^+_{PQ}$.

(b) Let P, Q be events which define a spacelike position four-vector. The value of
the Lorentz distance of the events P, Q as measured in a characteristic frame
$\Sigma^-_{PQ}$ of the four-vector $PQ^i$ coincides (numerically!) with the Newtonian spatial
distance measured by a Newtonian Inertial Observer who coincides with the
RIO $\Sigma^-_{PQ}$.


In the following we call the method of measurement of relativistic spatial and temporal
distances chronometry of a pair of events.


Example 5.7.1 Two rods 1, 2 of length $l_1$ and $l_2 = n l_1$ respectively move with
constant speed along parallel directions and in the opposite sense. An observer $\Sigma_1$
at one end of the rod 1 measures that a time interval $\tau_1 = l_1/k$ ($k > 1$) is required
for the rod 2 to pass in front of him. Calculate:

(a) The relative speed of the rods
(b) The time interval measured by the observer $\Sigma_2$ at one end of the rod 2, required
for the rod 1 to pass in front of him.




Solution 

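Let A, B be the events of observation of the ends of the rod $l_2$ as it passes in front
of the observer $\Sigma_1$ positioned at the one end of the rod $l_1$. We have the following
table for the coordinates of these events in the LCF of the RIO $\Sigma_1$, $\Sigma_2$:

              $\Sigma_1$                      $\Sigma_2$
    A:        $(0,\ 0)$                       $(0,\ 0)$
    B:        $(c\tau_1,\ 0)$                 $(c\tau',\ l_2)$
    $AB^i$:   $(c\tau_1,\ 0)_{\Sigma_1}$      $(c\tau',\ l_2)_{\Sigma_2}$

where $\tau'$ is the time interval between the events A, B in $\Sigma_2$, that is, the time
required for the end of the rod $l_1$ at which $\Sigma_1$ is positioned to travel along the rod
$l_2$. We have $\tau' = \frac{l_2}{c\beta}$ where $c\beta$ is the relative speed of the rods. The boost relating
$\Sigma_1$, $\Sigma_2$ gives $\tau' = \gamma\tau_1$, hence:

    $\beta\gamma = \frac{l_2}{c\tau_1} = \frac{n l_1}{c\,l_1/k} = \frac{nk}{c}$.

Using the identity $\gamma^2 - 1 = \beta^2\gamma^2$ we find eventually:

    $\gamma^2 = \frac{c^2 + n^2k^2}{c^2}, \qquad \beta = \frac{nk}{\sqrt{c^2 + n^2k^2}}$.

(b) The result does not change if we interchange the names of the rods $1 \leftrightarrow 2$.
Therefore without any further calculations we write:

    $\tau_2 = \frac{l_1}{c\beta\gamma} = \frac{l_1}{nk}$.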
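The two results of the example can be evaluated with a few lines of code. The sketch assumes that $k$ carries the dimension of speed (so that $\tau_1 = l_1/k$ is a time) and SI units; the function name `two_rods` and the sample values are illustrative only.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def two_rods(l1, n, k, c=C):
    """Return (beta, gamma, tau1, tau2) for rods of rest lengths l1 and n*l1."""
    beta_gamma = n * k / c                       # from l2 = beta*gamma*c*tau1
    gamma = math.sqrt(1.0 + beta_gamma ** 2)     # identity gamma^2 - 1 = beta^2 gamma^2
    beta = beta_gamma / gamma                    # equals n*k / sqrt(c^2 + n^2 k^2)
    tau1 = l1 / k                                # datum of the problem
    tau2 = l1 / (n * k)                          # answer to part (b)
    return beta, gamma, tau1, tau2

# Illustrative values: l1 = 1 m, n = 2, k = 1e8 m/s
beta, gamma, tau1, tau2 = two_rods(l1=1.0, n=2, k=1.0e8)
print(beta, gamma, tau1, tau2)
```

A quick consistency check is that $\beta\gamma c\,\tau_1$ reproduces $l_2$ and $\beta\gamma c\,\tau_2$ reproduces $l_1$.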


5.8 The Reality of Length Contraction and Time Dilation 


It is natural to pose the question: Do the phenomena of length contraction and
time dilation exist? The answer to that question is crucial in order to avoid confusion,
especially with the various paradoxes which, as a rule, are concerned with these two
phenomena.

Definition 5.7.1 relates the measurement of the relativistic spatial and temporal 
distance of two events, with the corresponding Newtonian measurement of these 
quantities. Verbally it can be stated as follows: 


    (Time difference $dl^+$ of events P, Q measured relativistically in the proper frame $\Sigma^+_{PQ}$)
        $=$ (Newtonian measurement of the time difference of the events P, Q taken by the Newtonian Inertial Observer who coincides with $\Sigma^+_{PQ}$)

    (Spatial distance $d\mathbf{r}^-$ of events P, Q measured relativistically in the characteristic frame $\Sigma^-_{PQ}$)
        $=$ (Newtonian measurement of the spatial distance of events P, Q taken by the Newtonian Inertial Observer who coincides with $\Sigma^-_{PQ}$)


With this definition the relativistic measurements/observations of the spatial and
temporal distance from the characteristic observer and the proper observer respectively
attain a Newtonian “reality”. However this is true only for these observers!
For the remaining RIO there is no “Newtonian reality” (that is, no comparison with a
corresponding Newtonian physical quantity) and one must use the appropriate
Lorentz transformation to estimate the value of spatial distance and time difference.
Therefore relations (5.5) and (5.8) must be understood as follows:


    (Chronometric measurement of time difference of the events P, Q by the RIO $\Sigma$)
        $= \gamma \times$ (Newtonian measurement of time difference of the events P, Q in the proper frame $\Sigma^+_{PQ}$)

and:

    (Chronometric measurement of spatial distance of the events P, Q by the RIO $\Sigma$)
        $= \frac{1}{\gamma} \times$ (Newtonian measurement of spatial distance of the events P, Q in the characteristic frame $\Sigma^-_{PQ}$)


Therefore the validation or not of the phenomena of length contraction and 
time dilation in nature does not concern the validity of Special Relativity as a 
theory (=logical structure) but the chronometry, that is, the proposed method of 
measuring/estimating Newtonian spatial distances and Newtonian time differences. 
Observation has confirmed that the above relations are true, therefore validates the 
chronometry of the two events. In practice this means that if we produce in the 
laboratory a beam of particles at the point A and wish to focus them at the point 
B, then the Euclidean distance (A B) must be determined relativistically. That is the 
position of the point B depends on the energy (speed) of the particles of the beam, 
which is not the case with Newtonian Physics. In a similar manner we determine 
the life time of a given unstable particle. In the laboratory the experimentally 
measured life time of an unstable particle depends on the speed of the particle 




contrary to the Newtonian view that the life time of a particle is unique. According 
to our assumption the unique value of the Newtonian approach coincides with the 
relativistic measurement of the life time of the particle only in its proper frame 
(photons excluded, they are stable anyway). 
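The dependence of the measured life time on the speed can be illustrated with the muon. This is a sketch under stated assumptions: the mean proper life time of the muon is taken as approximately $2.197 \times 10^{-6}$ s, the speeds are arbitrary illustrative values, and the function names are hypothetical.

```python
import math

C = 299_792_458.0       # speed of light in m/s
MUON_TAU0 = 2.197e-6    # approximate mean proper life time of the muon in s

def lab_lifetime(tau0, beta):
    """Relation (5.5): life time measured in the laboratory, tau = gamma * tau0."""
    return tau0 / math.sqrt(1.0 - beta * beta)

def mean_decay_length(tau0, beta, c=C):
    """Mean distance covered in the laboratory before decay."""
    return beta * c * lab_lifetime(tau0, beta)

for beta in (0.1, 0.9, 0.99, 0.999):
    print(beta, lab_lifetime(MUON_TAU0, beta), mean_decay_length(MUON_TAU0, beta))
```

The proper value is recovered only for $\beta \to 0$, that is, in the proper frame of the particle.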

From the above we conclude that the phenomena of length contraction and time 
dilation are real and must be taken into consideration in our (obligatory!) Newtonian 
measurements in the laboratory and/or in space. 

The dependence of the spatial distance and the time difference of two events on
the velocity of the particles need not worry us in our everyday activities. Indeed,
as we have already seen, the effects of relativistic kinematics are appreciable and
show up at high (relative) speeds, far beyond the speeds of our sensory capabilities.
Problems which refer to cars entering a garage while traveling with speeds e.g.
$0.9c$ concern “realities” limited at the level of student exercises and not further than
that.


5.9 The Rigid Rod 


The rigid rod is a purely Newtonian system which lies in the roots of Newtonian 
theory and expresses the absoluteness of three dimensional space. The rigid rod does 
not exist in Special Relativity, in the sense that spatial distance is not a relativistic 
physical quantity. However in practice and in the laboratory we do use rigid rods
(e.g. the standard meters) to measure distances, therefore it is important and useful
to define the concept of rigid rod in Special Relativity. This “relativistic”
rigid rod we shall associate via a correspondence principle with the Newtonian rigid
rod, in the same way we did in Sect. 5.7 for the spatial distance and the time interval.


Definition 5.9.1 The (one dimensional relativistic) rigid rod in Special Relativity 
is defined to be a (relativistic) physical system which in spacetime is represented by 
a bundle of timelike straight lines such that: 


(a) All lines are parallel (in the spacetime sense) 
(b) All lie in the same timelike plane. 


The system (relativistic) rigid rod is characterized by a single number (Lorentz 
invariant!), which we call rest length or proper length of the rigid rod. This 
number is specified by the following procedure. 

Each line in the bundle of the timelike straight lines defining the rigid rod can
be associated with the world line of a RIO. If this is done then one can consider the
(relativistic) rigid rod as a set of RIO which have zero relative velocity.⁷


⁷ These observers are called comoving observers of the rigid rod. In a similar manner one studies
the motion of a relativistic fluid.


Fig. 5.8 Relativistic measurement of the length of a rigid rod


Let A, B be the limiting outmost world lines of the bundle of the timelike lines
defining the rigid rod (see Fig. 5.8). Consider a RIO $\Sigma$ and let $\Sigma_\tau$ be the proper
space of $\Sigma$ at proper time $\tau$. This plane intersects the bundle of the lines defining
the rigid rod at line PQ say. The points P, Q are the intersection of the outmost
RIO A, B of the rod with $\Sigma_\tau$. The four-vector $PQ^i$ is spacelike because it lies on
the spacelike hyperplane $\Sigma_\tau$. The length of the four-vector $PQ^i$ we call the length
of the (relativistic) rigid rod as measured by the observer $\Sigma$.

The length (PQ) is uniquely defined for each RIO $\Sigma$ because:

1. It is defined geometrically by the intersection of two planes, each one uniquely
defined by the rigid rod and the observer.

2. The length of the rod is independent of the proper moment $\tau$ of $\Sigma$ because $\Sigma$ is
a RIO and all proper spaces are parallel, therefore the length of the intersection
with the two dimensional plane defining the rigid rod is independent of the point
at which we consider the proper plane.


From Fig.5.8 we note two important properties concerning the length of a 
(relativistic) rigid rod: 


• The length is not absolute (that is, it is not the same for all RIO) and varies with the speed
of the rod.

• The events P, Q are not simultaneous for the outmost observers A, B of the rod
($\tau_{A,P} \neq \tau_{A,Q}$ and similarly for B) while they are so for the observer $\Sigma$.


The RIOs for which the value of the distance (PQ) takes its maximum value we
call characteristic observers of the rigid rod. Obviously the world lines of these
observers are parallel to the world lines A, B which means that they have zero
velocity relative to the rod. The characteristic observer whose world line has the
same spatial distance from the world lines A, B we call rest observer⁸ of the rigid
rod and denote by $\Sigma_0$.


⁸ The rest observer is the unique characteristic observer with the property that if it (recall that we
do not use he/she) emits simultaneously two light signals towards the end points of the rod and
these signals are reflected at these points, then it will receive back the two signals simultaneously
and along antiparallel directions.




Fig. 5.9 Length of a 
(relativistic) rigid rod and 
Lorentz transformation 


If the length of the rigid rod measured by a characteristic observer is $L_0$ and as
measured by a RIO $\Sigma$, with rapidity $\phi$ wrt the characteristic observer, is $L$, then the
following relation holds between the two lengths (see Fig. 5.9):

    $L = \frac{L_0}{\cosh\phi} = \frac{L_0}{\gamma}$.

Because $\gamma > 1 \Rightarrow L < L_0$, from which follows that the characteristic observer
measures larger length for the rigid rod and in fact the maximal length (length
contraction).
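The two properties read off from Fig. 5.8 (independence of the proper moment of $\Sigma$, and non-simultaneity of P, Q for the comoving observers) can be checked in a two-dimensional sketch. The assumptions are: coordinates $(l, x)$ with $l = ct$ in the rest frame of the rod, outmost world lines at $x = 0$ and $x = L_0$, and a parameter `kappa` labelling the proper spaces $l' = \kappa$ of $\Sigma$; all names are illustrative.

```python
import math

def rod_section(L0, beta, kappa):
    """Events P, Q where the proper space l' = kappa of Sigma cuts the outmost
    world lines x = 0 and x = L0 of the rod (rest-frame coordinates (l, x))."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    # l' = gamma * (l - beta * x) = kappa  =>  l = kappa / gamma + beta * x
    P = (kappa / gamma, 0.0)
    Q = (kappa / gamma + beta * L0, L0)
    return P, Q

def measured_length(P, Q):
    """Lorentz length of the spacelike four-vector PQ^i."""
    dl, dx = Q[0] - P[0], Q[1] - P[1]
    return math.sqrt(-dl * dl + dx * dx)

L0, beta = 10.0, 0.6            # illustrative values
for kappa in (-3.0, 0.0, 7.0):
    P, Q = rod_section(L0, beta, kappa)
    print(kappa, measured_length(P, Q), Q[0] - P[0])
```

For every `kappa` the length should equal $L_0/\gamma$, while the rest-frame time difference of P and Q equals $\beta L_0 \neq 0$.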


5.10 Optical Images in Special Relativity 


The considerations of the previous sections were concerned with the temporal
and the spatial components of the position four-vector and their relation with the
corresponding concepts of time interval and space distance of Newtonian Physics.
There is one more Newtonian physical quantity, the creation of optical images, 
which is related to the position vector but not with the measurement of the time 
difference and space distance because it does not involve the Lorentz transformation. 
The transfer of this quantity to Special Relativity is important although it applies 
only to extended luminous bodies, which as a rule do not have practical applications 
in Special Relativity. 

In order to define the concept “appear” (= create optical image) in Special 
Relativity we generalize the corresponding concept of Newtonian Physics. In 
Newtonian Physics when we “see” a luminous object we receive at the eye (or 
the lens of a camera) simultaneously photons from the various parts of the body. 
Due to the finite speed of light and the different distances of the various points of 
the body from the eye the photons which arrive simultaneously at the eye must be 
emitted at different times. If the luminous body moves slowly wrt the eye then we 
may assume that the photons are emitted simultaneously from all points of the body 
(Newtonian approximation). In this case we have a “faithful” depiction of the body. 
But for relativistic speeds this approximation does not hold and one should expect 




distortion of the optical image. Due to the above we define the creation of optical 
images in Special Relativity as follows. 


Definition 5.10.1 The optical image of a set of luminous points at one space point
and at one time moment of a RIO $\Sigma$ is created from the points of emission, which
are simultaneous for $\Sigma$.


Before we continue we need the following simple result. 


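Exercise 5.10.1 Let $\Sigma$ and $\Sigma'$ be two LCF with parallel axes and relative velocity $\boldsymbol{\beta}$.
Consider a four-vector $A^i$ which in the LCF $\Sigma$ and $\Sigma'$ has components
$A^i = (A^0, \mathbf{A})_\Sigma$ and $A^i = (A^{0'}, \mathbf{A}')_{\Sigma'}$. Show that:

    $A^0 = \frac{1}{\beta}\left(A_\parallel - \frac{1}{\gamma}A'_\parallel\right)$    (5.12)

where $A_\parallel = \frac{1}{\beta}(\mathbf{A}\cdot\boldsymbol{\beta})$, $A'_\parallel = \frac{1}{\beta}(\mathbf{A}'\cdot\boldsymbol{\beta})$.
[Hint: Consider the Lorentz transformation which connects $\Sigma$, $\Sigma'$:

    $A^0 = \gamma\,(A^{0'} + \boldsymbol{\beta}\cdot\mathbf{A}')$

    $\mathbf{A} = \mathbf{A}' + \frac{\gamma-1}{\beta^2}(\boldsymbol{\beta}\cdot\mathbf{A}')\boldsymbol{\beta} + \gamma A^{0'}\boldsymbol{\beta}$

and in the second equation substitute $A^{0'}$ from the first. Then project parallel to $\boldsymbol{\beta}$.]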
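Identity (5.12) can be checked numerically before it is proved. The sketch below applies the transformation of the hint to arbitrary components and compares the two sides of (5.12); the helper names and the sample numbers are assumptions made for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def to_sigma(a0_p, a_p, beta):
    """Components (A^0, A) in Sigma from (A^0', A') in Sigma' (hint of Exercise 5.10.1)."""
    b2 = dot(beta, beta)
    gamma = 1.0 / math.sqrt(1.0 - b2)
    b_dot_a = dot(beta, a_p)
    a0 = gamma * (a0_p + b_dot_a)
    coef = (gamma - 1.0) / b2 * b_dot_a + gamma * a0_p
    a = tuple(x + coef * b for x, b in zip(a_p, beta))
    return a0, a, gamma

def parallel(a, beta):
    """Component of the 3-vector a along the direction of beta."""
    return dot(a, beta) / math.sqrt(dot(beta, beta))

beta = (0.2, -0.5, 0.3)                 # |beta| < 1, arbitrary direction
a0_p, a_p = 1.7, (0.4, 2.0, -1.1)       # arbitrary components in Sigma'
a0, a, gamma = to_sigma(a0_p, a_p, beta)
b = math.sqrt(dot(beta, beta))
rhs = (parallel(a, beta) - parallel(a_p, beta) / gamma) / b
print(a0, rhs)
```

The two printed values should agree up to rounding for any admissible choice of $\boldsymbol{\beta} \neq 0$.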


Suppose a photon is emitted from a point in space (event A) and the same photon
is received at another point in space (event B). Let $\Sigma$ be a RIO for which the events
A, B have components $(ct_A, \mathbf{r}_A)_\Sigma$, $(ct_B, \mathbf{r}_B)_\Sigma$. The four-vector $AB^i$ is a null vector
(because it concerns the propagation of a photon) therefore:

    $-c^2(t_B - t_A)^2 + (\mathbf{r}_B - \mathbf{r}_A)^2 = 0$.

We write $R_{AB} = |\mathbf{r}_B - \mathbf{r}_A|$ and assuming that $t_B > t_A$ (both time moments in $\Sigma$!)
we have:

    $t_B = t_A + \frac{R_{AB}}{c}$.    (5.13)


This relation connects in $\Sigma$ (!) the distance covered by the photon after its emission
with the time interval elapsed after its emission. Let $\Sigma'$ be another RIO which moves
wrt $\Sigma$ with parallel axes and arbitrary constant velocity. Let $(ct'_A, \mathbf{r}'_A)_{\Sigma'}$ be the
components of the event A in $\Sigma'$. According to Exercise 5.10.1 for the zeroth
component of the four-vector $A^i$ we have:

    $ct_A = \frac{1}{\beta}\left(r_{A\parallel} - \frac{1}{\gamma}r'_{A\parallel}\right)$    (5.14)

where $r_{A\parallel} = \frac{1}{\beta}(\mathbf{r}_A\cdot\boldsymbol{\beta})$, $r'_{A\parallel} = \frac{1}{\beta}(\mathbf{r}'_A\cdot\boldsymbol{\beta})$, $\beta = |\boldsymbol{\beta}| \neq 0$. From (5.13) and (5.14) we
have the relation which connects the spatial position of the event A of emission in
$\Sigma'$ with the time and the spatial distance covered by the photon in $\Sigma$:


1 1 
ctp = S(ray — —r4) + Rap > 

B l| y All 
ray = Y (ray — Bete + BRaz). (5.15) 


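This relation gives the distance $r'_{A\parallel}$ of the image in $\Sigma'$ as a function of the 
distance $r_{A\parallel}$ in $\Sigma$. 

Normal to the direction of the velocity $\boldsymbol\beta$ the Lorentz transformation has no effect, 
hence: 

$$\mathbf{r}'_{A\perp} = \mathbf{r}_{A\perp}. \tag{5.16}$$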
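As a quick numerical check of (5.15), the following Python sketch compares it with a direct boost of the emission event. It is only an illustration under stated assumptions: units with c = 1, the parallel direction taken as the boost axis, and reception at the origin B. The function names and the sample values are not from the text.

```python
import math

def gamma(beta):
    """Lorentz factor for a speed beta given in units of c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def image_parallel_position(r_par, R_AB, ct_B, beta):
    """Eq. (5.15): parallel coordinate in Sigma' of the emission event A."""
    return gamma(beta) * (r_par - beta * ct_B + beta * R_AB)

def boost_parallel(ct, r_par, beta):
    """Standard boost of (ct, r_parallel) from Sigma to Sigma'."""
    g = gamma(beta)
    return g * (ct - beta * r_par), g * (r_par - beta * ct)

# Sample data (assumed): emission at r_par = 3, r_perp = 4, reception at the origin.
beta, r_par, r_perp = 0.6, 3.0, 4.0
R_AB = math.hypot(r_par, r_perp)  # distance covered by the photon in Sigma
ct_B = 10.0                       # reception time in Sigma
ct_A = ct_B - R_AB                # emission time, from (5.13) with c = 1

_, direct = boost_parallel(ct_A, r_par, beta)
via_515 = image_parallel_position(r_par, R_AB, ct_B, beta)
print(direct, via_515)
```

Both routes give the same parallel coordinate because (5.15) is the parallel part of the boost with the emission time eliminated through (5.13).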


Using relations (5.15) and (5.16) it is possible to compute the image in $\Sigma$ if we know 
a luminous object in $\Sigma'$. Indeed let A, $\Gamma$ be two points of a luminous body which are 
observed at the point B of $\Sigma$ at the moment of time $t_B$ (of $\Sigma$). Then for each point 
equations (5.15) and (5.16) hold separately and the condition of the creation of the 
images of the points A, $\Gamma$ in $\Sigma$ is that the time in the rhs of the equations is common 
and equal to $t_B$. Subtracting equations (5.15) for the points A, $\Gamma$ and eliminating the 
time $t_B$ we find the equation: 

$$r'_{A\parallel} - r'_{\Gamma\parallel} = \gamma\left[r_{A\parallel} - r_{\Gamma\parallel} + \beta(R_{AB} - R_{\Gamma B})\right] \tag{5.17}$$

which expresses the “parallel” spatial distance of the points A, $\Gamma$ as it is seen by 
the observer $\Sigma$ at the point B of $\Sigma$. We note that this expression is independent 
of the time of observation in $\Sigma$, hence the image parallel to the relative velocity 
remains the same; it is “frozen”, as we say. In addition we note that for $c \to \infty$ we have 
$r'_{A\parallel} - r'_{\Gamma\parallel} = r_{A\parallel} - r_{\Gamma\parallel}$, that is, we have the absolute depiction of Newtonian Physics. 

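Without restricting generality one may consider the point B to be the origin of 
the LCF $\Sigma$, in which case $R_A^2 = \mathbf{r}_A^2 = x_A^2 + y_A^2 + z_A^2$ and $R_\Gamma^2 = \mathbf{r}_\Gamma^2 = x_\Gamma^2 + y_\Gamma^2 + z_\Gamma^2$. 
If in addition we restrict our considerations to boosts, then $r'_{A\parallel} = x'_A$, $r_{A\parallel} = x_A$ and 
equations (5.15) and (5.17) read: 

$$x'_A = \gamma\left(x_A - \beta ct_B + \beta R_A\right) \tag{5.18}$$
$$x'_\Gamma = \gamma\left(x_\Gamma - \beta ct_B + \beta R_\Gamma\right) \;\Rightarrow \tag{5.19}$$
$$x'_A - x'_\Gamma = \gamma\left[x_A - x_\Gamma + \beta(R_A - R_\Gamma)\right]. \tag{5.20}$$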




Fig. 5.10 Relativistic observation of an optical image 


This relation holds if $\Sigma'$ moves wrt $\Sigma$ with speed $\beta$. If the velocity of $\Sigma'$ is $-\beta$ 
the relation remains the same with the change that the dashed quantities have to be 
interchanged with the unprimed, that is: 

$$x_A - x_\Gamma = \gamma\left[x'_A - x'_\Gamma + \beta(R'_A - R'_\Gamma)\right]. \tag{5.21}$$


Example 5.10.1 A rod AC of length $l_0$ is resting parallel to the $x'$-axis and in the 
plane $y'x'$ of the RIO $\Sigma'$, while $\Sigma'$ is moving in the standard configuration along the 
$x$-axis of the RIO $\Sigma$ with velocity $\beta$. Calculate the length $l$ of the image of the rod 
on a film which is placed at the origin O of $\Sigma$ (Fig. 5.10). 
Solution 

Let: 

$$|x'_A - x'_C| = l_0, \qquad |x_A - x_C| = l.$$

From (5.20) we have: 

$$|x'_A - x'_C| = \gamma\,\left|x_A - x_C + \beta(R_A - R_C)\right|. \tag{5.22}$$


We consider two cases depending on whether the rod is resting along the $x'$-axis or 
not. 

(A) The rod is resting along the $x'$-axis, hence $y' = y = 0$. 

We consider two cases depending on whether the rod is approaching or moving 
away from the origin O of $\Sigma$. In the first case we have ($l_0$ being the proper length of the 
rod in $\Sigma'$) $|x'_A - x'_C| = l_0$, $x_A - x_C = R_A - R_C = -l_-$, $\beta < 0$ and replacing in 
(5.20) we find: 

$$l_- = l_0\sqrt{\frac{1+|\beta|}{1-|\beta|}}. \tag{5.23}$$




Fig. 5.11 Variation of the apparent length with velocity 

When the rod is moving away we have $|x'_A - x'_C| = l_0$, $x_A - x_C = R_A - R_C = -l_+$, 
$\beta > 0$ and (5.20) gives: 

$$l_+ = l_0\sqrt{\frac{1-\beta}{1+\beta}}. \tag{5.24}$$


Figure 5.11 shows the variation of the lengths $l_+$, $l_-$ for various values of 
the velocity of the rod. The same figure also shows the standard Lorentz 
contraction ($l_0/\gamma$) for comparison. We note that while the Lorentz contraction is 
independent of the direction of motion (hence symmetric about the value $\beta = 0$) 
the situation is different for the “apparent” length of the rod which behaves as a 
“wave” (because equations (5.23) and (5.24) express the Doppler effect). Note that 


$$l_+ = \frac{l_0}{\gamma}\,\frac{1}{1+\beta} < \frac{l_0}{\gamma}, \qquad l_- = \frac{l_0}{\gamma}\,\frac{1}{1-|\beta|} > \frac{l_0}{\gamma}.$$
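The comparison between the apparent lengths of (5.23), (5.24) and the Lorentz contraction can be tabulated with a few lines of Python. This is a minimal sketch that assumes a signed convention in which a single formula covers both cases (positive beta for a receding rod, negative beta for an approaching one); the function names are illustrative.

```python
import math

def apparent_length(l0, beta):
    """Image length of a rod of proper length l0 moving along the line of sight.

    Signed convention (an assumption of this sketch): beta > 0 means the rod
    recedes from the film, beta < 0 means it approaches.
    """
    return l0 * math.sqrt((1.0 - beta) / (1.0 + beta))

def lorentz_length(l0, beta):
    """Standard Lorentz-contracted length l0 / gamma."""
    return l0 * math.sqrt(1.0 - beta * beta)

for beta in (-0.6, 0.0, 0.6):
    print(beta, apparent_length(1.0, beta), lorentz_length(1.0, beta))
```

For a rod of unit proper length at speed 0.6 the apparent length is larger than the Lorentz-contracted length when the rod approaches and smaller when it recedes, which is the asymmetry described above.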


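(B) The rod is moving so that $y' = y = c_1 \neq 0$. 

In this case equation (5.20) gives: 

$$l = \frac{l_0}{\gamma} - \beta(R_A - R_C). \tag{5.25}$$

If we denote by $\theta$ the angle of the position vector of the end point A of the rod with the 
$x$-axis in $\Sigma$ and by $\theta - \Delta\theta$ the corresponding angle for the other end C, relation (5.25) is written as follows: 

$$l = \frac{1}{\gamma}\left[l_0 + \beta\gamma c_1\left(\frac{1}{\sin(\theta - \Delta\theta)} - \frac{1}{\sin\theta}\right)\right]. \tag{5.26}$$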




When $\pi > \theta > \frac{\pi}{2}$ the rod approaches the origin O of $\Sigma$ and when $\frac{\pi}{2} > \theta > 0$ the 
rod is moving away from the origin of $\Sigma$. 

In order to get a feeling of the result we consider the limit $\Delta\theta \ll 1$ (equivalently 
$l \ll R_{AB}$) in which case we can write (prove it!) for the difference $R_A - R_C = l\cos\theta$. 
In this case (5.25) becomes: 

$$l = \frac{l_0}{\gamma(1 + \beta\cos\theta)}. \tag{5.27}$$

We infer that in this case the observer sees the standard Lorentz contraction only if 
the observer looks along the direction of the $y$-axis (that is, $\theta = \frac{\pi}{2}$). 
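To see how good the small-rod limit is, one can solve (5.25) numerically and compare it with the limiting expression obtained by substituting $R_A - R_C = l\cos\theta$, that is $l = l_0/[\gamma(1 + \beta\cos\theta)]$. The sketch below is an assumption-laden illustration: the film is at the origin, end A sits at $(x_A, c_1)$, end C at $(x_A - l, c_1)$, and the root is found by plain bisection. All names and sample values are hypothetical.

```python
import math

def exact_image_length(l0, beta, x_A, c1):
    """Solve l = l0/gamma - beta*(R_A - R_C), Eq. (5.25), by bisection.

    End A is at (x_A, c1) and end C at (x_A - l, c1) in Sigma (assumed layout).
    """
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    R_A = math.hypot(x_A, c1)

    def f(l):
        R_C = math.hypot(x_A - l, c1)
        return l + beta * (R_A - R_C) - l0 / g

    # f is increasing in l, f(0) < 0, and f(hi) >= 0 because |R_A - R_C| <= l.
    lo, hi = 0.0, l0 / (g * (1.0 - abs(beta)))
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def approx_image_length(l0, beta, theta):
    """Small-rod limit: l = l0 / (gamma * (1 + beta * cos(theta)))."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return l0 / (g * (1.0 + beta * math.cos(theta)))

# Sample data (assumed): rod of unit proper length seen from a distance of 1000.
R, theta, beta, l0 = 1000.0, math.radians(60.0), 0.6, 1.0
exact = exact_image_length(l0, beta, R * math.cos(theta), R * math.sin(theta))
approx = approx_image_length(l0, beta, theta)
print(exact, approx)
```

With these values the two lengths agree to better than one part in a thousand, and setting the angle to 90 degrees in `approx_image_length` returns the Lorentz-contracted length $l_0/\gamma$.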


Example 5.10.2 A right circular cylinder⁹ of radius $R$ and length $L_0$ is rotating 
with constant angular speed $\omega$ about its axis, which coincides with the $x$-axis of 
the LCF $\Sigma$. Let $\Sigma'$ be another RIO which moves in the standard configuration along 
the common $x$, $x'$-axis with speed $\beta$. 

a. Prove that in $\Sigma'$ the cylinder appears to be a right circular cylinder of radius 
$R$ and length $L_0/\gamma$. 

b. Assume that along the surface of the cylinder some points have been marked (for 
example by fixing small flags whose height is negligible compared to the 
radius of the cylinder) which in the RIO $\Sigma$ appear to be along a generatrix of the 
cylinder. Prove that in $\Sigma'$ these points do not appear to be along a generatrix of 
the cylinder, but along a spiral which appears from one end of the cylinder and 
disappears at the other end of the cylinder. 


Solution 


a. The equation of the surface of the (right circular) cylinder in $\Sigma$ is $y^2 + z^2 = R^2$. 
The boost relating $\Sigma$, $\Sigma'$ gives $y' = y$, $z' = z$. Therefore in $\Sigma'$ the equation of 
the surface is $y'^2 + z'^2 = R^2$, which is the surface of a right circular cylinder of 
radius $R$. For the remaining two coordinates $(ct, x)$ and $(ct', x')$ the boost gives: 

$$x' = \gamma(x - ut)$$
$$ct = \gamma(ct' + \beta x')$$

from which follows $x' = \frac{1}{\gamma}x - ut'$. Let A, B be the points of intersection of the 
base planes of the cylinder with the $x$-axis. The coordinates of these points in $\Sigma$ 
are $0$, $L_0$. At the moment $t'$ of $\Sigma'$ these points have coordinates whose difference 
is $x'_B - x'_A = \frac{1}{\gamma}L_0$, which is the expected Lorentz contraction and at the same 
time the height of the cylinder in $\Sigma'$. 


⁹ This problem was proposed by the German physicist von Laue in the early days of Special 
Relativity. 




b. The points of a generatrix of the cylinder in $\Sigma$ have coordinates: 

$$x, \qquad y = R\cos\omega t, \qquad z = R\sin\omega t$$

and velocity: 

$$\mathbf{v} = (0,\; -R\omega\sin\omega t,\; R\omega\cos\omega t).$$

We compute the corresponding quantities in $\Sigma'$. The coordinates are: 

$$x' = \frac{1}{\gamma}x - ut'$$
$$y' = y = R\cos\omega t = R\cos\left[\frac{\omega\gamma}{c}(ct' + \beta x')\right] = R\cos\left(\frac{\omega}{\gamma}t' + \frac{\beta\omega}{c}x\right)$$
$$z' = z = R\sin\omega t = R\sin\left[\frac{\omega\gamma}{c}(ct' + \beta x')\right] = R\sin\left(\frac{\omega}{\gamma}t' + \frac{\beta\omega}{c}x\right).$$

In order to define how $\Sigma'$ “sees” the marked points we consider $t' =$ constant. From 
the equations of the transformation it follows that the marked points lie along a 
(right circular) helix of radius $R$ and pace $\frac{2\pi c}{\beta\gamma\omega}$. The helix develops all along the 
length of the cylinder and it is “frozen” during the motion because both its radius 
and its pace are independent of $t'$. The effect of the rotation is that the helix 
appears in $\Sigma'$ to emerge from the nearest end of the cylinder and disappear at the 
other end (see Fig. 5.12). 

b. In order to calculate the velocity of the marked points of the cylinder in $\Sigma'$ we 
take the derivative¹⁰ of the coordinates of the points in the LCF $\Sigma'$ with respect 
to $t'$. The answer (assuming $x =$ constant) is: 

$$\mathbf{v}' = \left(-u,\; -\frac{R\omega}{\gamma}\sin\left(\frac{\omega}{\gamma}t' + \frac{\beta\omega}{c}x\right),\; \frac{R\omega}{\gamma}\cos\left(\frac{\omega}{\gamma}t' + \frac{\beta\omega}{c}x\right)\right).$$

This motion can be considered as a combination of two motions: a translational 
motion along the common $x$, $x'$-axis with speed $-u$ and a rotational motion 
about the same axis with angular speed $\omega' = \frac{\omega}{\gamma}$ (Question: What happens when 
the speed $u$ of $\Sigma'$ is such that $\omega = \gamma$?). 

Fig. 5.12 The image of the marked points in $\Sigma'$ 

¹⁰ One can also use the relativistic rule of composition of velocities. 


5.11 How to Solve Problems Involving Spatial and Temporal 
Distance 


In this section we present a number of methods for the solution of problems 
involving the application of the Lorentz transformation — mainly boosts — in 
problems involving spatial and temporal distance. These methods can be used for 
calculations in the laboratory and especially in the discussion of the paradoxes 
which are based on the misunderstanding of the relativistic spatial and temporal 
distance and their relation to the corresponding Euclidian quantities. In any case we 
strongly suggest that the material of this chapter should be studied thoroughly by 
those who are not experienced with Special Relativity. 


5.11.1 A Brief Summary of the Lorentz Transformation 


Before we proceed we collect the results we have so far for the Lorentz transforma- 
tion. 

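Consider two LCF $\Sigma$ and $\Sigma'$ with parallel axes and relative velocity $\mathbf{u}$. The 
(proper) Lorentz transformation $L_{+\uparrow}$ which relates the physical quantities of $\Sigma$ and 
$\Sigma'$ is (see (1.47)): 

$$L_{+\uparrow} = \begin{pmatrix} \gamma & -\gamma\boldsymbol\beta^t \\ -\gamma\boldsymbol\beta & I + \dfrac{\gamma-1}{\beta^2}\boldsymbol\beta\boldsymbol\beta^t \end{pmatrix}. \tag{5.28}$$

For example for the position four-vector $(l, \mathbf{r})$ the transformation (5.28) gives: 

$$\mathbf{r}' = \mathbf{r} + \left[\frac{\gamma-1}{\beta^2}(\boldsymbol\beta\cdot\mathbf{r}) - \gamma l\right]\boldsymbol\beta$$
$$l' = \gamma(l - \boldsymbol\beta\cdot\mathbf{r}). \tag{5.29}$$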


The vector form of the Lorentz transformation (5.28) and (5.29) holds only if 
the axes of $\Sigma$ and $\Sigma'$ are parallel (in the Euclidian sense) in the same Euclidian 
space. Relations (5.28) and (5.29) are not quantitative and, in general, can be used 
in qualitative analysis only. In case we want to make explicit calculations we have 
to employ a coordinate system and write these equations in coordinate form. 
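Once a coordinate system has been chosen, the vector form (5.29) can be evaluated directly. The following sketch is one possible coordinate implementation in plain Python, assuming parallel axes, $l = ct$, and a nonzero velocity given in units of c; the helper names are illustrative and not part of the text.

```python
import math

def boost_vector_form(l, r, beta):
    """Apply (5.29) to the position four-vector (l, r), with l = ct.

    beta is the 3-velocity of Sigma' relative to Sigma in units of c.
    It must be nonzero, because (5.29) divides by beta squared.
    """
    b2 = sum(b * b for b in beta)
    g = 1.0 / math.sqrt(1.0 - b2)
    br = sum(b * x for b, x in zip(beta, r))
    coeff = (g - 1.0) / b2 * br - g * l
    r_new = [x + coeff * b for x, b in zip(r, beta)]
    l_new = g * (l - br)
    return l_new, r_new

def interval(l, r):
    """Lorentz length squared, -l^2 + r^2."""
    return -l * l + sum(x * x for x in r)

l, r = 2.0, [1.0, -1.0, 0.5]
l_new, r_new = boost_vector_form(l, r, [0.3, 0.4, 0.0])
print(interval(l, r), interval(l_new, r_new))
```

The two printed numbers agree, as they must for a Lorentz transformation, and for a velocity along the x-axis the function reduces to the familiar boost.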




A special and important subclass of Lorentz transformations are the boosts. This 
is due to the fact (see (1.43)) that a general Lorentz transformation can be written 
as a product of two Euclidian rotations and one boost. Two LCF are related with a 
boost of velocity $\beta$ along the common $x$-axis¹¹ if and only if: 

1. The axes $x$, $x'$ are common and the axes $y$, $y'$ and $z$, $z'$ are parallel. 

2. The velocity $u$ of $\Sigma'$ relative to $\Sigma$ is parallel to the common axis $x$. 

3. The clocks of $\Sigma$ and $\Sigma'$ have been synchronized so that when the origins coincide 
(that is $x = y = z = x' = y' = z' = 0$) the clocks are reset to zero (that is 
$t = t' = 0$). 


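The boost relating the coordinates $(l, x, y, z)$ and $(l', x', y', z')$ of an event P, 
say, in $\Sigma$ and $\Sigma'$ reads: 

$$\begin{aligned} x' &= \gamma(x - \beta l) \\ y' &= y \\ z' &= z \\ l' &= \gamma(l - \beta x). \end{aligned} \tag{5.30}$$

The inverse Lorentz transformation from $\Sigma'$ to $\Sigma$ is: 

$$\begin{aligned} x &= \gamma(x' + \beta l') \\ y &= y' \\ z &= z' \\ l &= \gamma(l' + \beta x') \end{aligned} \tag{5.31}$$

that is, the sign of $\beta$ in (5.30) changes. The matrix form of the transformation is: 

$$L_+(\beta) = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{5.32}$$

This matrix is symmetric. This is not the case for the matrix representing the general 
Lorentz transformation (i.e. not parallel axes and arbitrary relative velocity). 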
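The statements about the boost matrix (5.32), namely that it is symmetric and that its inverse is obtained by changing the sign of $\beta$, are easy to confirm numerically. The sketch below uses plain Python lists; the function names are illustrative.

```python
import math

def boost_matrix(beta):
    """Matrix (5.32) for a boost along the x-axis, acting on (l, x, y, z)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# Boost followed by the boost with opposite velocity: should be the identity.
product = matmul(boost_matrix(0.6), boost_matrix(-0.6))
for row in product:
    print([round(entry, 12) for entry in row])
```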


¹¹ Of course boosts in any other direction can be defined similarly. In the following, in order to 
save on writing, we shall call the boost along the $x$-axis the “typical configuration”. 




5.11.2 Parallel and Normal Decomposition of Lorentz 
Transformation 


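As we have seen, the Lorentz transformation has no effect normal to the direction of 
the relative velocity $\mathbf{u}$, while parallel to the direction of $\mathbf{u}$ the action of the Lorentz 
transformation is a boost. This observation leads us to consider the following 
methodology of solving problems in Special Relativity. Consider a four-vector 
$A^i = (A^0, \mathbf{A})$ (not necessarily the position vector) and let $\mathbf{u}$ be the relative velocity 
of two LCF $\Sigma$ and $\Sigma'$. We decompose in $\Sigma$ the vector $\mathbf{A}$ along and normal to $\mathbf{u}$: 

$$\mathbf{A}_\parallel = \frac{\mathbf{A}\cdot\mathbf{u}}{u^2}\,\mathbf{u}, \qquad \mathbf{A}_\perp = \mathbf{A} - \mathbf{A}_\parallel$$

and write in an obvious notation: $A^i = (A^0, \mathbf{A}_\parallel, \mathbf{A}_\perp)_\Sigma$. Suppose that in $\Sigma'$ the 
four-vector $A^i$ has components $A^i = (A^{0'}, \mathbf{A}'_\parallel, \mathbf{A}'_\perp)_{\Sigma'}$. Then the (proper) Lorentz 
transformation gives the relations: 

$$\begin{aligned} A^{0'} &= \gamma(A^0 - \boldsymbol\beta\cdot\mathbf{A}_\parallel) \\ \mathbf{A}'_\parallel &= \gamma(\mathbf{A}_\parallel - \boldsymbol\beta A^0) \\ \mathbf{A}'_\perp &= \mathbf{A}_\perp. \end{aligned} \tag{5.33}$$

For the Euclidian length $\mathbf{A}'^2$ we have: 

$$\mathbf{A}'^2 = \mathbf{A}'^2_\perp + \mathbf{A}'^2_\parallel = \mathbf{A}_\perp^2 + \gamma^2(\mathbf{A}_\parallel - \boldsymbol\beta A^0)^2. \tag{5.34}$$

The angle $\theta$ between the directions of $\mathbf{A}$ and $\mathbf{u}$ is given by: 

$$\tan\theta = \frac{|\mathbf{A}_\perp|}{|\mathbf{A}_\parallel|}. \tag{5.35}$$

The corresponding angle $\theta'$ of $\mathbf{A}'$ with the velocity $\mathbf{u}$ in $\Sigma'$ is computed as 
follows: 

$$\tan\theta' = \frac{|\mathbf{A}'_\perp|}{|\mathbf{A}'_\parallel|} = \frac{|\mathbf{A}_\perp|}{\gamma\,|A^0\boldsymbol\beta - \mathbf{A}_\parallel|} = \frac{|\mathbf{A}|\sin\theta}{\gamma\,\big|A^0\beta - |\mathbf{A}|\cos\theta\big|}. \tag{5.36}$$

We note that $\theta \neq \theta'$ for every 3-vector $\mathbf{A}$. This result has many applications 
under the generic name aberration. 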
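The parallel and normal decomposition translates directly into a small routine for the transformed angle. The sketch below is an illustration, not the book's procedure: it keeps the sign of the parallel component and uses `atan2`, so the returned angle lies between 0 and pi. For a null four-vector it can be compared with the familiar aberration formula for light, which is quoted here as a known result.

```python
import math

def aberrated_angle(A0, A_norm, theta, beta):
    """Angle theta' between A' and the relative velocity, from (5.33).

    Uses the signed parallel component, so theta' lies in [0, pi]; the
    absolute values in (5.36) correspond to folding this angle into [0, pi/2].
    """
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    A_par = A_norm * math.cos(theta)
    A_perp = A_norm * math.sin(theta)
    A_par_prime = g * (A_par - beta * A0)  # parallel part is boosted
    return math.atan2(A_perp, A_par_prime)  # normal part is unchanged

# Null four-vector (A0 = |A| = 1), e.g. the propagation direction of a light ray.
theta, beta = math.pi / 2, 0.6
theta_prime = aberrated_angle(1.0, 1.0, theta, beta)
print(math.degrees(theta), math.degrees(theta_prime))
```

A ray normal to the velocity in $\Sigma$ is tilted backwards in $\Sigma'$, in agreement with $\cos\theta' = (\cos\theta - \beta)/(1 - \beta\cos\theta)$.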


5.11.3 Methodologies of Solving Problems Involving Boosts 


The difficult part in the solution of a problem in Special Relativity is the recognition 
of the events and their coordinate description in the appropriate LCF of the problem. 
As a rule in a relativistic problem there are involved: 


a. Two events (or relativistic physical quantities) A, B (say) 

b. Two LCF $\Sigma$ and $\Sigma'$ (say) and 

c. Data concerning some of the components of the events in one LCF and some 
data in the other. With the aid of the Lorentz transformation relating $\Sigma$ and $\Sigma'$ 
one is able to compute all the components of the involved quantities (usually 
four-vectors) A, B. 


For the solution of problems in Special Relativity involving space distances and 
time distances one has to assign correctly who is the observer measuring the relevant 
quantity directly (that is, using the Newtonian method) and who is the observer 
computing it via the appropriate Lorentz transformation. Concerning the question 
of measurement of space distances and time distances we have the following: 


Case 1 


Consider two events A, B, which define the timelike four-vector $AB^i$. As we have 
shown, there exists a unique LCF $\Sigma^+_{AB}$, the proper frame of the events A, B, in which 
the Lorentz length of the four-vector equals the zeroth component $(AB)^+$. This 
number (by definition!) equals the time difference of the two events as measured by 
a Newtonian observer in $\Sigma^+_{AB}$. In every other LCF $\Sigma$, which has velocity $\beta$ wrt 
$\Sigma^+_{AB}$, the events A, B have (relativistic!) time difference 

$$(AB) = \gamma\,(AB)^+. \tag{5.37}$$

The time difference $(AB)$ is not measured by the Newtonian observer in $\Sigma$ but 
is either computed by the Lorentz transformation relating $\Sigma$ and $\Sigma^+_{AB}$ or 
measured directly in $\Sigma$ by chronometry. The value of the quantity $(AB)$ varies with 
the frame $\Sigma$ and it holds that $(AB) > (AB)^+$ (time dilation). 

Geometrically, the measurement of the temporal difference is shown in Fig. 5.13. 
Description of the diagram 

$\Sigma$: World line of observer 
$\Sigma^+_{AB}$: World line of proper frame of the clock 
A: Newtonian observation of the indication A of the clock in $\Sigma^+$ 
B: Newtonian observation of the indication B of the clock in $\Sigma^+$ 
$(AB)^+$: Newtonian measurement of temporal length in $\Sigma^+$ 
$(AC)$: Chronometric measurement of the temporal length $(AB)$ by $\Sigma$. 



Fig. 5.13 Relativistic measurement of the temporal difference (labels in the diagram: Newtonian measurement, chronometry, $\cosh\phi = \gamma$) 


Concerning the calculation, from the triangle ABC we have: 

$$(AC) = (AB)^+\cosh\phi = \gamma\,(AB)^+.$$


Case 2 


Consider two events A, B, which define a spacelike four-vector $AB^i$ and are studied 
in the LCF $\Sigma$. Let $\Sigma^-$ be the characteristic LCF of the spacelike four-vector $AB^i$ 
and let $L_0 = (AB)^-$ be the Lorentz length of this four-vector. We have defined 
$(AB)^-$ to be the spatial distance of the events A, B as measured by a Newtonian 
observer at rest in $\Sigma^-$. If the (relativistic) spatial distance of the events A, B in $\Sigma$ 
is $L = (AB)$ (measured chronometrically) and the relative speed of $\Sigma$, $\Sigma^-$ is $\beta$, 
the Lorentz transformation gives the relation: 

$$L = \frac{L_0}{\gamma}. \tag{5.38}$$

The length $L_0$ is an invariant whereas the length $L$ is not and depends on the 
observer $\Sigma$ (the factor $\beta$). We have the obvious inequality $L < L_0$ (spatial distance 
contraction or, as it has been inappropriately established, length contraction). 

The measurement of spatial distance is described geometrically in Fig. 5.14. 
Description of the diagram 

$\Sigma$: World line of observer 
$\Sigma_1$: World line of end A of spatial distance 
$\Sigma_2$: World line of end B of spatial distance 
$\Sigma_A$: Proper space of $\Sigma$ at the point A 
C: Intersection of proper space $\Sigma_A$ with world line $\Sigma_2$ 
$AB^i$: Common (Lorentz) normal of $\Sigma_1$, $\Sigma_2$ at the event A 
$(AB)^-$: Newtonian distance (proper distance) $L_0$ of world lines in the characteristic LCF $\Sigma^-_{AB}$. 
$(AC)$: Spatial distance of world lines $\Sigma_1$ and $\Sigma_2$ as measured chronometrically by $\Sigma$. 




Fig. 5.14 Chronometric 
measurement of spatial 
distance 


Concerning the calculations, from the triangle ABC of Fig. 5.14 we find: 

$$(AB)^- = (AC)\cosh\phi = \gamma\,(AC).$$


Case 3 


Consider two events A, B which define a null vector. A null four-vector does not 
have a characteristic LCF therefore it makes no sense to measure (relativistic) spatial 
or temporal distance. 


5.11.4 The Algebraic Method 


This method involves the construction of a table containing the coordinates of the 
four-vectors in the LCF $\Sigma$, $\Sigma'$ and subsequently the application of the Lorentz 
transformation relating the data of the table. More precisely, for every pair of events 
A, B one constructs the following table of coordinates: 

$$\begin{array}{l|l|l}
 & \Sigma & \Sigma' \\ \hline
A: & (l_A, x_A, y_A, z_A) & (l'_A, x'_A, y'_A, z'_A) \\
B: & (l_B, x_B, y_B, z_B) & (l'_B, x'_B, y'_B, z'_B) \\ \hline
AB^i: & (l_B - l_A,\; x_B - x_A,\; \ldots)_\Sigma & (l'_B - l'_A,\; x'_B - x'_A,\; \ldots)_{\Sigma'}
\end{array}$$

The two events define the four-vector $AB^i$ whose components in $\Sigma$ and $\Sigma'$ 
equal the difference of the components of the events A, B in $\Sigma$ and $\Sigma'$ respectively 
and are related by the Lorentz transformation. In case $\Sigma$, $\Sigma'$ move in the standard 
configuration with velocity $\beta$ we have the equations: 

$$\begin{aligned}
l'_B - l'_A &= \gamma\left[(l_B - l_A) - \beta(x_B - x_A)\right] \\
x'_B - x'_A &= \gamma\left[(x_B - x_A) - \beta(l_B - l_A)\right] \\
y'_B - y'_A &= y_B - y_A \\
z'_B - z'_A &= z_B - z_A.
\end{aligned}$$

Usually these equations suffice for the solution of the problem. It is important that 
before inserting the data in the table one clarifies which observer does the Newtonian 
and which the chronometric measurement of space and time distances. We call this 
method of solving relativistic problems the algebraic method. 
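The table of the algebraic method is simple to build programmatically. The sketch below is one hypothetical way to organize it in Python for the standard configuration, with events stored as tuples $(l, x, y, z)$ and $l = ct$; the function names and the sample events are assumptions made for illustration.

```python
import math

def boost_event(event, beta):
    """Boost an event (l, x, y, z), l = ct, from Sigma to Sigma'."""
    l, x, y, z = event
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return (g * (l - beta * x), g * (x - beta * l), y, z)

def coordinate_table(A, B, beta):
    """Rows A, B and AB of the table, each as (components in Sigma, in Sigma')."""
    A_p, B_p = boost_event(A, beta), boost_event(B, beta)
    AB = tuple(b - a for a, b in zip(A, B))
    AB_p = tuple(b - a for a, b in zip(A_p, B_p))
    return {"A": (A, A_p), "B": (B, B_p), "AB": (AB, AB_p)}

# Two events that are simultaneous in Sigma (sample data).
table = coordinate_table((0.0, 0.0, 0.0, 0.0), (0.0, 1.0, 2.0, 0.0), beta=0.6)
for name, (in_sigma, in_sigma_prime) in table.items():
    print(name, in_sigma, in_sigma_prime)
```

Because the boost is linear, the last row can be obtained either by subtracting the boosted events or by boosting the difference; the two agree, which is a useful consistency check on a hand-made table.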


Example 5.11.1 A rod AB moves in the plane $xy$ of the LCF $\Sigma$ in such a way that 
the end A slides along the $x$-axis with constant speed $u$ whereas the rod makes at all 
times (in $\Sigma$!) a constant angle $\theta$ with the $x$-axis. Consider another LCF $\Sigma'$ which 
moves relative to $\Sigma$ in the standard configuration along the $x$-axis with speed $v$. Let 
$m = \tan\theta$ be the inclination of the rod in $\Sigma$. Calculate the corresponding quantity 
$m'$ in $\Sigma'$. 
1st solution 

We consider the end points A, B of the rod and assume that the measurement of 
the length of the rod is done in $\Sigma'$. Furthermore we assume that at time $t = 0$ of $\Sigma$ 
the point A is at the origin of $\Sigma$. These assumptions lead to the following table of 
coordinates of the events A, B: 

$$\begin{array}{l|l|l}
 & \Sigma & \Sigma' \\ \hline
A: & (ct_A,\; ut_A,\; 0,\; 0) & (ct'_A,\; x'_A,\; 0,\; 0) \\
B: & (ct_B,\; ut_B + l_x,\; l_y,\; 0) & (ct'_A,\; x'_A + l'_x,\; l'_y,\; 0) \\ \hline
AB^i: & (c\Delta t,\; u\Delta t + l_x,\; l_y,\; 0)_\Sigma & (0,\; l'_x,\; l'_y,\; 0)_{\Sigma'}
\end{array}$$

where $\Delta t = t_B - t_A$. The coordinates of the four-vector $AB^i$ in $\Sigma$, $\Sigma'$ are related 
with the boost along the $x$-axis with speed $v$: 

$$\begin{aligned}
l'_x &= \gamma_v\left(l_x + u\Delta t - v\Delta t\right) \\
l'_y &= l_y \\
c\Delta t &= \gamma_v\left(0 + \frac{v}{c}\,l'_x\right).
\end{aligned}$$

Eliminating $\Delta t$ from the third equation and replacing in the first we find: 

$$l'_x = \frac{l_x}{\gamma_v\left(1 - \frac{uv}{c^2}\right)} \;\Rightarrow\; m' = \frac{l'_y}{l'_x} = \gamma_v\left(1 - \frac{uv}{c^2}\right)m.$$


2nd solution 

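Let $\Sigma^-$ be the characteristic frame of the rod AB. We assume measurement of 
the length of the rod in $\Sigma^-$ and find the following table of coordinates of the events 
A, B: 

$$\begin{array}{l|l|l}
 & \Sigma & \Sigma^- \\ \hline
A: & (ct_A,\; ut_A,\; 0,\; 0) & (ct^-_A,\; 0,\; 0,\; 0) \\
B: & (ct_B,\; ut_B + l_x,\; l_y,\; 0) & (ct^-_A,\; l^-_x,\; l^-_y,\; 0) \\ \hline
AB^i: & (c\Delta t,\; u\Delta t + l_x,\; l_y,\; 0)_\Sigma & (0,\; l^-_x,\; l^-_y,\; 0)_{\Sigma^-}
\end{array}$$

The boost relating $\Sigma$ and $\Sigma^-$ gives: 

$$l^-_x = \gamma_u\left(l_x + u\Delta t - u\Delta t\right) = \gamma_u l_x, \qquad l^-_y = l_y$$

hence: 

$$m^- = \frac{l^-_y}{l^-_x} = \frac{l_y}{\gamma_u l_x} = \frac{m}{\gamma_u}.$$

Because $\Sigma$ is arbitrary we write without any further calculations for $\Sigma'$: 

$$m^- = \frac{m'}{\gamma_{u'}}.$$

It follows: 

$$\frac{m'}{\gamma_{u'}} = \frac{m}{\gamma_u}.$$

But we know that the $\gamma$-factors are related as follows¹²: 

$$\gamma_{u'} = \gamma_u\gamma_v\left(1 - \frac{uv}{c^2}\right). \tag{5.39}$$

Replacing $\gamma_{u'}$ we find: 

$$m' = \gamma_v\left(1 - \frac{uv}{c^2}\right)m.$$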


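3rd solution (with relative velocities) 
We assume (chronometric) measurement of the length of the rod in $\Sigma$ and have 
the following table of coordinates of the events A, B: 

$$\begin{array}{l|l|l}
 & \Sigma & \Sigma' \\ \hline
A: & (ct_A,\; ut_A,\; 0,\; 0) & (ct'_A,\; x'_A,\; 0,\; 0) \\
B: & (ct_A,\; ut_A + l_x,\; l_y,\; 0) & (ct'_B,\; x'_B,\; l'_y,\; 0) \\ \hline
AB^i: & (0,\; l_x,\; l_y,\; 0)_\Sigma & (c\Delta t',\; x'_B - x'_A,\; l'_y,\; 0)_{\Sigma'}
\end{array}$$

The boost relating $\Sigma$, $\Sigma'$ gives: 

$$\begin{aligned}
x'_B - x'_A &= \gamma_v l_x \\
\Delta t' &= -\gamma_v\frac{v}{c^2}\,l_x.
\end{aligned}$$

But in $\Sigma'$: 

$$x'_B - x'_A = u'\Delta t' + l'_x$$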

¹² Proof of (5.39). 

One method to prove (5.39) is to replace each $\gamma$ in terms of the velocity and make direct 
calculations. This is awkward. The simple and recommended proof is to use the four-velocity 
vector, whose zeroth component is $\gamma c$, and apply the Lorentz transformation for the four-velocity. 
Let us see this proof. The four-velocity of the end point A in the LCF $\Sigma$ and $\Sigma'$ has components: 

$$(\gamma_u c,\; \gamma_u u,\; 0,\; 0)_\Sigma, \qquad (\gamma_{u'}c,\; \gamma_{u'}u',\; 0,\; 0)_{\Sigma'}.$$

These two expressions are related with a boost as above. For the zeroth component we have: 

$$\gamma_{u'}c = \gamma_v\left(\gamma_u c - \frac{v}{c}\gamma_u u\right) = \gamma_u\gamma_v c\left(1 - \frac{uv}{c^2}\right).$$

Check the boost for the space components. The result will be used in the next solution. 




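because the events A, B are not simultaneous in $\Sigma'$. From the relativistic rule of 
composing 3-velocities (see end of last footnote) we have: 

$$u' = \frac{u - v}{1 - \frac{uv}{c^2}}.$$

Replacing: 

$$x'_B - x'_A = l'_x - \frac{u - v}{1 - \frac{uv}{c^2}}\,\gamma_v\frac{v}{c^2}\,l_x = \gamma_v l_x.$$

It follows: 

$$l'_x = \frac{l_x}{\gamma_v\left(1 - \frac{uv}{c^2}\right)}.$$

Replacing in the relation $m' = l'_y/l'_x$ one obtains the previous result. 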
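The result of this example, $m' = \gamma_v(1 - uv/c^2)\,m$, can also be checked numerically by locating the two ends of the rod at events that are simultaneous in $\Sigma'$. The following sketch does this in Python under the assumptions of the first solution (end A at the origin at $t = 0$, units with c = 1 by default); the function names and sample values are hypothetical.

```python
import math

def inclination_in_sigma_prime(m, u, v, c=1.0):
    """Closed form obtained in the example: m' = gamma_v (1 - u v / c^2) m."""
    gamma_v = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma_v * (1.0 - u * v / c**2) * m

def inclination_by_direct_boost(l_x, l_y, u, v, c=1.0):
    """Locate both ends simultaneously in Sigma' and take the slope there."""
    gamma_v = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    # End A is taken at the event (t, x) = (0, 0), so t'_A = 0 and x'_A = 0.
    # End B follows x_B(t) = u t + l_x; simultaneity in Sigma' requires
    # t_B - v x_B / c^2 = 0, which fixes t_B.
    t_B = v * l_x / (c**2 - u * v)
    x_B = u * t_B + l_x
    x_B_prime = gamma_v * (x_B - v * t_B)
    return l_y / x_B_prime  # y is unchanged by the boost

l_x, l_y, u, v = 2.0, 1.0, 0.5, 0.3
closed = inclination_in_sigma_prime(l_y / l_x, u, v)
direct = inclination_by_direct_boost(l_x, l_y, u, v)
print(closed, direct)
```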


Example 5.11.2 Two events A, B are simultaneous in the LCF $\Sigma$ and have a space 
distance (in $\Sigma$!) $c$ km along the $x$-axis. Compute the $\beta$ factor of another LCF $\Sigma'$ 
moving in the standard way along the $x$-axis so that the event A has a time distance 
(i) 1/10 s, (ii) 1 s, (iii) 10 s from event B. 

Algebraic solution 
The two events are simultaneous in $\Sigma$. Hence we have the following table of 
coordinates: 

$$\begin{array}{l|l|l}
 & \Sigma & \Sigma' \\ \hline
A: & (ct_A,\; x_A) & (ct'_A,\; x'_A) \\
B: & (ct_A,\; x_A + d) & (ct'_B,\; x'_B) \\ \hline
AB^i: & (0,\; d)_\Sigma & (-c\Delta t',\; x'_B - x'_A)_{\Sigma'}
\end{array}$$

where $d$ is the distance of the events in $\Sigma$. The boost relating $\Sigma$, $\Sigma'$ gives: 

$$-c\Delta t' = \gamma(0 - \beta d) = -\gamma\beta d \;\Rightarrow\; \beta\gamma = \frac{c\Delta t'}{d}.$$

Squaring and making use of the (useful!) identity $\gamma^2\beta^2 = \gamma^2 - 1$ we find: 

$$\gamma^2 = 1 + \frac{c^2\Delta t'^2}{d^2}.$$



Replacing $\gamma$ in terms of $\beta$ and setting $d = c$ we get eventually: 

$$\beta = \frac{c\Delta t'}{\left(c^2\Delta t'^2 + d^2\right)^{1/2}} = \frac{\Delta t'}{\sqrt{1 + \Delta t'^2}}.$$

For the given values of $\Delta t'$ we find: 

(i) For $\Delta t' = 1/10$ s, $\beta = 0.0995$ or $v = 0.0995c$ ($\approx 10\%\ c$). 
(ii) For $\Delta t' = 1$ s, $\beta = 0.707$ or $v = 0.707c$ ($\approx 71\%\ c$). 
(iii) For $\Delta t' = 10$ s, $\beta = 0.995$ or $v = 0.995c$ ($\approx 99.5\%\ c$). 

We observe that the more the speed of $\Sigma'$ increases the more the event A delays 
(in $\Sigma'$!) wrt the event B. Indeed, solving the last expression for $\Delta t'$, we find 
$\Delta t' = \beta\gamma = \sqrt{\gamma^2 - 1}$. 
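The three numerical values quoted above follow from the closed-form expression for $\beta$ in terms of $c\Delta t'$ and $d$. A short Python sketch, assuming the spatial distance is numerically equal to the speed of light expressed in km/s (the constant name and function name are illustrative):

```python
import math

C_KM_PER_S = 299792.458  # speed of light in km/s

def beta_for_time_distance(dt_prime, d, c=C_KM_PER_S):
    """beta of Sigma' for which events a distance d apart and simultaneous
    in Sigma are separated by the time dt_prime in Sigma'."""
    return c * dt_prime / math.sqrt((c * dt_prime) ** 2 + d**2)

d = C_KM_PER_S  # spatial distance of "c km", as in the example
for dt_prime in (0.1, 1.0, 10.0):
    print(dt_prime, round(beta_for_time_distance(dt_prime, d), 4))
```

The inverse relation $\Delta t' = \beta\gamma\, d/c$ can be used to confirm each value.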


5.11.5 The Geometric Method 


For the solution of problems involving relativistic spatial and temporal distances we 
can use spacetime diagrams. We call this methodology, which is suitable mainly for 
simple problems, the geometric method and it is realized in the following steps: 


1. Events. 
Draw the world lines of the observers involved and identify the events which 
concern the problem 
2. Given quantities. 
Construct in the spacetime diagram the given space and time distances 
3. Requested quantities 
Construct in the spacetime diagram the requested space and/or time distances 
4. Solution. 
Use relations (5.30) and (5.31), which give (as a rule) direct solution to the 
problem 


As is the case with the algebraic method, when one draws the various quantities in 
the spacetime diagram, he/she must be aware of who measures with the Newtonian 
method and who with the chronometric. 


Example 5.11.3 Two spaceships 1, 2, each of length $l$, pass each other moving 
parallel and in opposite directions. If an observer at one end of spaceship 1 measures 
that a time $T$ is required for the spaceship 2 to pass in front of him, find the relative 
speed $u$ of the spaceships. 

Algebraic solution 
We consider the events: 

Event A: The “nose” of spaceship 2 passes in front of the observer 
Event B: The “tail” of spaceship 2 passes in front of the observer. 




The table of coordinates of the events A, B in the LCF of the two spaceships is: 

$$\begin{array}{l|l|l}
 & \text{Spaceship 1} & \text{Spaceship 2} \\ \hline
A: & (ct_A,\; x) & (ct'_A,\; x'_A) \\
B: & (ct_B,\; x) & (ct'_A + l/\beta,\; x'_A - l) \\ \hline
AB^i: & (cT,\; 0)_1 & (l/\beta,\; -l)_2
\end{array}$$

$l/u$ is the time required for spaceship 2 to pass in front of the nose of spaceship 1. 
The boost relating the components of the position four-vector $AB^i$ gives ($\beta = u/c$): 

$$\beta\gamma = \frac{l}{cT} \;\Rightarrow\; \beta = \frac{l}{\sqrt{c^2T^2 + l^2}}.$$
Geometric solution 
We consider the spacetime diagram of Fig. 5.15: 
Description of the diagram 
Events: 

$A_1$, $B_1$: World lines of observers at the end points of spaceship 1 
$A_2$, $B_2$: World lines of observers at the end points of spaceship 2 
C: The nose of spaceship 2 passes in front of the nose $A_1$ of the spaceship 1 
D: The tail of spaceship 2 passes in front of the nose $A_1$ of the spaceship 1 
DZ: Proper space of observer B at the event D 

Data: 

CD: Time distance required for spaceship 2 to pass in front of the tip $A_1$ of 
spaceship 1, $(CD) = Tc$. 
DE: Proper length of spaceship 1, $(DE) = l$. 
DZ: Proper length of spaceship 2, $(DZ) = l$. 


Fig. 5.15 Geometric 
measurement of the length of 
the spaceship 




Requested: 
Angle $\phi$. 
From the triangle CZD we have: 

$$(DC) = (DZ)/\sinh\phi = l/\beta\gamma \;\Rightarrow\; \beta\gamma = \frac{l}{cT} \;\Rightarrow\; \beta = \frac{l}{\sqrt{c^2T^2 + l^2}}.$$

Example 5.11.4 An astronaut on a space station sees a spaceship move towards 
him with constant speed $0.6c$. Checking in the database of the known spaceships he 
finds that the spaceship could be the Enterprize. According to the file in the database 
the length of the Enterprize is 120 m and it emits recognition signals every 80 s. The 
astronaut measures the length of the spaceship and finds 96 m. He measures the 
period of the emitted light signals and finds 100 s. Is it possible to conclude from 
these measurements that the approaching spaceship is the Enterprize? 
Algebraic solution 
We consider the events: 


A: Observation of one end of the unknown spaceship 
B: Observation of other end of the unknown spaceship 


Consider $\Sigma$ to be the space station and let $\Sigma'$ be the proper frame of the spaceship. 
For the (chronometric) measurement of the length of the approaching spaceship at 
the time $t$ of $\Sigma$ the events A, B must be simultaneous in $\Sigma$, therefore 
we have the following table of coordinates of the events A, B: 

$$\begin{array}{l|l|l}
 & \Sigma & \Sigma' \\ \hline
A: & (ct,\; x_A,\; 0,\; 0) & (ct'_A,\; x'_A,\; 0,\; 0) \\
B: & (ct,\; x_A + \Delta x,\; 0,\; 0) & (ct'_A + c\Delta t',\; x'_A + \Delta x',\; 0,\; 0) \\ \hline
AB^i: & (0,\; \Delta x,\; 0,\; 0)_\Sigma & (c\Delta t',\; \Delta x',\; 0,\; 0)_{\Sigma'}
\end{array}$$

The boost relating the components of the four-vector $AB^i$ in $\Sigma$ and $\Sigma'$ gives: 

$$\Delta x' = \gamma(\Delta x - \beta\cdot 0) = \gamma\Delta x$$

where $\gamma = (1 - 0.6^2)^{-1/2} = 1.25$. Replacing we find $\Delta x' = 1.25 \times 96$ m $= 120$ m, 
which coincides with the length of the Enterprize given in the database. 

We continue with the identification of the recognition signal. Now the events 
A, B are the emission of successive light signals (from the same position of 
the spaceship). Let $\Delta t'$ be the time distance of the light signals in $\Sigma'$ and $\Delta t$ the 
corresponding difference in $\Sigma$. We have the following table of coordinates for the 
events A, B: 

$$\begin{array}{l|l|l}
 & \Sigma & \Sigma' \\ \hline
A: & (ct_A,\; x,\; 0,\; 0) & (ct'_A,\; x'_A,\; 0,\; 0) \\
B: & (ct_A + c\Delta t,\; x + \Delta x,\; 0,\; 0) & (ct'_A + c\Delta t',\; x'_A,\; 0,\; 0) \\ \hline
AB^i: & (c\Delta t,\; \Delta x,\; 0,\; 0)_\Sigma & (c\Delta t',\; 0,\; 0,\; 0)_{\Sigma'}
\end{array}$$

The boost relating the components of $AB^i$ in $\Sigma$, $\Sigma'$ gives: 

$$\Delta t = \gamma(\Delta t' + \beta\cdot 0/c) = \gamma\Delta t' = 1.25 \times 80\ \text{s} = 100\ \text{s}$$

which coincides with the period of the signals measured by the astronaut. The astronaut 
concludes that the approaching spaceship appears to be the Enterprize. 

Fig. 5.16 Geometric measurement of spatial and temporal distances 

Geometric solution 

Consider the spacetime diagram of Fig. 5.16: 

Description of the diagram 


(a) Length measurement: 
Events: 


$\Sigma$: World line of cosmic platform 
$A_1$, $B_1$: World lines of the end points of the unknown spaceship 
A, B: Events of measurement of the space length of the unknown spaceship at 
the moment $t_A$ of the platform 
$\Sigma_A$: Proper space of the platform at the event A 

Data: 

(BE): (Proper) length of unknown spaceship, (BE) = 120 m 
(BA): Measured length of unknown spaceship, (BA) = 96 m 
$\phi$: $\tanh\phi = \beta = 0.6$, $\cosh\phi = \gamma = 1.25$ 

Requested: 
Compatibility of the data. 
From triangle BEA we have: 

$$(BE) = (BA)\cosh\phi = (BA)\,\gamma = 96 \times 1.25\ \text{m} = 120\ \text{m}$$

therefore the space lengths are compatible. 


(b) Measurement of the time distance (period) of the recognition signal. 




We consider again the spacetime diagram of Fig. 5.16: 
Description of the diagram 
Events: 


$\Sigma$: World line of cosmic platform 
C, D: Events of emission of successive light signals 
C, $D_1$: Events of observation of successive light signals from the platform 
$\Sigma_D$: Proper space of the space platform $\Sigma$ at the event D. 

Data: 

(CD) = 80 s: (Proper) time distance of successive light signals emitted from 
the spaceship. 
$(CD_1)$ = 100 s: Measured time distance of successive light signals in the space 
platform $\Sigma$ 
$\phi$: $\tanh\phi = \beta = 0.6$, $\cosh\phi = \gamma = 1.25$ 

Requested: 
Compatibility of the data. 
From triangle $CDD_1$ we have: 

$$(CD_1) = (CD)\cosh\phi = \gamma\,(CD) = 1.25 \times 80\ \text{s} = 100\ \text{s}$$

which is compatible with the data of the database. 
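The two compatibility checks of Example 5.11.4 amount to one multiplication by $\gamma$ each. A minimal Python sketch of the check, with hypothetical function names and the data of the example:

```python
import math

def gamma(beta):
    """Lorentz factor for a speed beta given in units of c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def proper_length(measured_length, beta):
    """Proper length corresponding to a chronometrically measured length."""
    return gamma(beta) * measured_length

def dilated_period(proper_period, beta):
    """Period in the observer's frame corresponding to a proper period."""
    return gamma(beta) * proper_period

beta = 0.6
length_ok = math.isclose(proper_length(96.0, beta), 120.0)   # database: 120 m
period_ok = math.isclose(dilated_period(80.0, beta), 100.0)  # measured: 100 s
print(length_ok, period_ok)
```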


Chapter 6 
Relativistic Kinematics 


6.1 Introduction 


In the previous chapters we developed the basic concepts of relativistic motion, that 
is, the time and spatial distance. The methodology we followed was the definition of 
these concepts, in agreement with the corresponding concepts of Newtonian theory, 
which is already a “relativistic” theory of motion. 

In the present chapter we continue with the development of the kinematics of 
Special Relativity. We introduce the basic four-vectors of four-velocity and four- 
acceleration. We define the concept of relative four-vector and finally show how the 
Lorentz transformation defines the law of composition of 3-velocities. The 
four-acceleration is discussed in a separate chapter, due to its “peculiarities” and its 
crucial role in both kinematics and dynamics of Special Relativity. 


6.2 Relativistic Mass Point 


Special Relativity is a geometric theory of Physics. Therefore all its concepts must 
be defined geometrically before they are studied. The basic physical object of 
kinematics is the “moving point”, which in Special Relativity is described by a 
worldline, that is, a spacetime curve whose position four-vector and tangent 
four-vector at all its points are timelike or null. The reason we consider two types of 
curves is that in Special Relativity there are two types of particles with different 
kinematic (and dynamic) properties, the particles with mass and the photons. The 
particles with mass we call Relativistic Mass Points (ReMaP) and geometrize with 
a timelike worldline, whereas the photons we associate with null worldlines. In this 
section we develop the kinematics of ReMaPs. The kinematics of photons we study 
in another place. 


© Springer Nature Switzerland AG 2019 
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-27347-7_6 




A timelike worldline can be affinely parametrized.¹ The affine parameter is called
the proper time $\tau$ of the ReMaP.

In order to do Physics we must associate relativistic physical quantities to a ReMaP.
We assume that with each ReMaP there are associated two relativistic physical
quantities: the position four-vector $x^i$ and the proper time $\tau$. These quantities are
sufficient for the kinematics of the ReMaP. Concerning the dynamics, one has to
associate additional physical quantities to a ReMaP.

Using the physical quantities $x^i$, $\tau$ we define new ones along the world line,
using the two basic rules of constructing new tensors from given ones (see
Proposition 1.4.1): multiplication with an invariant and differentiation wrt an
invariant. These new tensors are potential relativistic physical quantities and become
relativistic physical quantities either by means of a principle or by identification
with a Newtonian physical quantity in a specific frame. From $x^i$, $\tau$ we define five
new Lorentz tensors (two four-vectors and three invariants):


$$u^i = \frac{dx^i}{d\tau} \tag{6.1}$$

$$a^i = \frac{d^2 x^i}{d\tau^2} \tag{6.2}$$

$$u^i u_i, \quad a_i u^i, \quad a^i a_i. \tag{6.3}$$


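In order to study the new four-vectors we use special coordinate systems in which
these vectors attain their reduced form. Because $x^i(\tau)$ is a timelike curve, it admits
at each point an instantaneous proper frame $\Sigma^+$ in which
$x^i = \begin{pmatrix} c\tau \\ \mathbf{0} \end{pmatrix}_{\Sigma^+}$. In this frame we compute:

$$u^i = \frac{dx^i}{d\tau} = \begin{pmatrix} c \\ \mathbf{0} \end{pmatrix}_{\Sigma^+}$$

as well as:

$$u^i u_i = -c^2.$$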


From these two relations we infer the following: 


1. The timelike vectors $x^i$, $u^i$ have a common proper frame.

2. The length of the four-vector $u^i$ is the universal constant $c$.

3. The four-vector $u^i$ is determined in its proper frame by $c$ only, hence it is common
for all particles with mass.


¹ Affine parametrization means that the tangent vector $dx^i/d\tau$ has constant length along
the world line. Proper time is the affine parameter (modulo a linear transformation) for which the
squared length of the tangent vector equals $-c^2$.




Because $u^i$ is determined only in terms of $c$, which (in Special Relativity) is
a physical quantity, we infer that $u^i$ is a relativistic physical quantity. This new
relativistic quantity we call the four-velocity of the ReMaP at the point $x^i(\tau)$.

We continue with the rest of the quantities defined by relations (6.2) and (6.3).
Differentiating the relation $u^i u_i = -c^2$ wrt $\tau$ gives $u^i a_i = 0$, which implies:


1. The four-vector $a^i$ is normal to $u^i$, therefore it is a spacelike four-vector.
2. The invariant $u^i a_i$ is trivial.
3. In the proper frame $\Sigma^+$ of the four-vectors $x^i$, $u^i$ the four-vector $a^i$ has
components (why?):

$$a^i = \begin{pmatrix} 0 \\ \mathbf{a}^+ \end{pmatrix}_{\Sigma^+} \tag{6.4}$$


We note that $\Sigma^+$ is the characteristic frame of the spacelike four-vector $a^i$. We
identify in $\Sigma^+$ the 3-vector $\mathbf{a}^+$ with the Newtonian acceleration of the mass point
in $\Sigma^+$ and then the four-vector $a^i$ becomes a relativistic physical quantity. This new
quantity we call four-acceleration of the worldline at the point $x^i(\tau)$. Geometrically
the four-acceleration accounts for the curvature of the worldline at each of its points,
therefore the vanishing of the four-acceleration (equivalently of the 3-vector $\mathbf{a}^+$)
means that the worldline is a straight line in spacetime. This result is consistent 
with the requirement that the worldline of a Relativistic Inertial Observer (RIO) is 
a straight line. Furthermore we see that with this correspondence the Newtonian 
Inertial Observers correspond to Relativistic Inertial Observers (RIO) which is an 
important fact, because all observations are made by Newtonian observers (i.e. us). 

Based on the above analysis, we give the following important definition. 


Definition 6.2.1 A relativistic mass point (ReMaP) with timelike worldline $A(\tau)$
is the set of all relativistic physical quantities which are defined at every point of
$A(\tau)$ and have a common proper frame or characteristic frame, depending on whether
they are timelike or spacelike four-vectors.

According to this definition, $x^i$ is the position four-vector, $u^i$ is the four-velocity
and $a^i$ is the four-acceleration of the ReMaP whose worldline is the curve $x^i(\tau)$.

It is to be noted that the above apply to particles with mass, not to photons 
(or particles with speed c), which are described by null position four-vectors. The 
position vectors of photons do not have a special type of frame in which they obtain a 
reduced form. The kinematics of photons is studied by means of other four-vectors. 


Example 6.2.1 Calculate the components of the four-velocity $u^i$ in the LCF in
which the particle has 3-velocity $\mathbf{u}$.
Solution 

Consider a ReMaP $P$ with worldline $x^i(\tau)$ and let $\Sigma$ be a LCF in which
the position four-vector is $x^i = \begin{pmatrix} ct \\ \mathbf{r} \end{pmatrix}_{\Sigma}$. In the proper frame $\Sigma^+$ of $P$ we have
$x^i = \begin{pmatrix} c\tau \\ \mathbf{0} \end{pmatrix}_{\Sigma^+}$. The two expressions are related with a Lorentz transformation (not
necessarily a boost), hence:

$$ct = \gamma c\tau \;\Rightarrow\; t = \gamma\tau. \tag{6.5}$$

The components of the four-velocity $u^i$ of $P$ in $\Sigma$ are:

$$u^i = \frac{dx^i}{d\tau} = \frac{dt}{d\tau}\frac{dx^i}{dt} = \gamma\begin{pmatrix} c \\ \mathbf{u} \end{pmatrix}_{\Sigma} \tag{6.6}$$

where $\mathbf{u}$ is the 3-velocity of the ReMaP in $\Sigma$. We find the same result if we apply the
general Lorentz transformation (1.51) and (1.52) to the four-vector $u^i = \begin{pmatrix} c \\ \mathbf{0} \end{pmatrix}_{\Sigma^+}$.

Exercise 6.2.1 Using the above expression for the coordinates of $u^i$ in $\Sigma$, show
that in $\Sigma$ $u^i u_i = -c^2$, which reaffirms that the quantity $u^i u_i$ is an invariant.
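
The invariance of $u^i u_i$ can also be checked numerically. The following is a minimal Python sketch, not part of the original text: the function names `gamma`, `four_velocity` and `minkowski_dot` are illustrative choices, and the signature $(-,+,+,+)$ is assumed, in agreement with $u^i u_i = -c^2$. It builds the components $\gamma(c, \mathbf{u})$ for an arbitrary subluminal 3-velocity and evaluates the inner product.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v, c=C):
    """Lorentz factor for a 3-velocity given as a sequence (vx, vy, vz)."""
    speed_sq = sum(comp * comp for comp in v)
    return 1.0 / math.sqrt(1.0 - speed_sq / c**2)


def four_velocity(v, c=C):
    """Components (gamma*c, gamma*vx, gamma*vy, gamma*vz) of the four-velocity."""
    g = gamma(v, c)
    return (g * c, g * v[0], g * v[1], g * v[2])


def minkowski_dot(a, b):
    """Inner product of two four-vectors with signature (-, +, +, +)."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))


# An arbitrary 3-velocity with speed below c (illustrative values).
u = four_velocity((0.3 * C, -0.4 * C, 0.5 * C))

# The ratio u^i u_i / c^2 should equal -1 up to rounding errors.
ratio = minkowski_dot(u, u) / C**2
```

Changing the 3-velocity changes every component of `u`, but `ratio` stays at $-1$ within floating point accuracy, which is the content of Exercise 6.2.1.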


In the following example we show how the four-velocity is computed for certain 
motions. 


Example 6.2.2 The LCF $\Sigma'$ is moving in the standard configuration with speed $u$
wrt the LCF $\Sigma$. A particle moves in the $x, y$ plane of $\Sigma$ with constant speed $v$ around
a circular orbit of radius $r$ centered at the origin of $\Sigma$. Calculate the four-velocity
of the particle in $\Sigma'$. Calculate the $\gamma'$-factor of the particle in $\Sigma'$. Find the relation
between the time $t$ in $\Sigma$ and the proper time $\tau$ of the particle. Write the equation of
the trajectory of the particle in $\Sigma$ in terms of the proper time $\tau$.

1st Solution 

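The equation of the orbit of the particle in $\Sigma$ is:

$$x = r\cos\omega t = r\cos\left(\frac{vt}{r}\right), \qquad y = r\sin\omega t = r\sin\left(\frac{vt}{r}\right).$$

Therefore the 3-velocity of the particle in $\Sigma$ is:

$$v_x = -v\sin\left(\frac{vt}{r}\right), \qquad v_y = v\cos\left(\frac{vt}{r}\right)$$

and the four-velocity:

$$v^i = (\gamma c, \gamma v_x, \gamma v_y, 0)_{\Sigma} = \left(\gamma c, -\gamma v\sin\left(\frac{vt}{r}\right), \gamma v\cos\left(\frac{vt}{r}\right), 0\right)_{\Sigma}$$

where $\gamma = 1/\sqrt{1 - v^2/c^2}$.

In order to compute the four-velocity of the particle in $\Sigma'$, we consider the boost
relating $\Sigma$, $\Sigma'$. For the zeroth component we have:

$$\gamma' c = \gamma_u\left(\gamma c - \frac{u}{c}\gamma v_x\right) = \gamma_u\gamma c\left(1 - \frac{u v_x}{c^2}\right) \;\Rightarrow\; \gamma' = \gamma_u\gamma Q \tag{6.7}$$

where $Q = 1 - \frac{u v_x}{c^2}$ and $\gamma_u = 1/\sqrt{1 - u^2/c^2}$. For the spatial components we compute:

$$v_{x'} = \frac{v_x - u}{Q} \tag{6.8}$$

$$v_{y'} = \frac{v_y}{\gamma_u Q}. \tag{6.9}$$

Hence the four-velocity of the particle in $\Sigma'$ is:

$$v^{i'} = \begin{pmatrix} \gamma' c \\ \gamma' v_{x'} \\ \gamma' v_{y'} \\ 0 \end{pmatrix}_{\Sigma'} = \begin{pmatrix} \gamma_u\gamma Q c \\ \gamma_u\gamma (v_x - u) \\ \gamma v_y \\ 0 \end{pmatrix}_{\Sigma'} = \begin{pmatrix} \gamma_u\gamma Q c \\ -\gamma_u\gamma\left(v\sin\frac{vt}{r} + u\right) \\ \gamma v\cos\frac{vt}{r} \\ 0 \end{pmatrix}_{\Sigma'}.$$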


The proper time of the particle in $\Sigma$ is $\tau = t/\gamma$, where $t$ is the time in $\Sigma$. Therefore
the orbit of the particle in $\Sigma$ in terms of the proper time is:

$$x = r\cos\left(\frac{v\gamma\tau}{r}\right), \qquad y = r\sin\left(\frac{v\gamma\tau}{r}\right).$$


[What do you have to say for the four-vector (ft, a a 0)>?]. 
2nd solution 

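First we transform the orbit of the particle to $\Sigma'$ and then we compute the
four-velocity using derivation wrt time. The boost relating $\Sigma$, $\Sigma'$ gives:

$$ct' = \gamma_u\left(ct - \frac{u}{c}x\right), \qquad x' = \gamma_u(x - ut), \qquad y' = y.$$

From the first equation follows:

$$dt' = \gamma_u\left(dt - \frac{u}{c^2}dx\right) = \gamma_u Q\,dt \;\Rightarrow\; \frac{dt}{dt'} = \frac{1}{\gamma_u Q}$$

hence:

$$v_{x'} = \frac{dx'}{dt'} = \frac{dx'}{dt}\frac{dt}{dt'} = \frac{v_x - u}{Q} = -\frac{1}{Q}\left[v\sin\left(\frac{vt}{r}\right) + u\right]$$

$$v_{y'} = \frac{dy'}{dt'} = \frac{dy'}{dt}\frac{dt}{dt'} = \frac{v_y}{\gamma_u Q} = \frac{1}{\gamma_u\left(1 - \frac{u v_x}{c^2}\right)}\,v\cos\left(\frac{vt}{r}\right).$$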


The rest of the solution is the same. 
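
As a cross-check of the first solution, the sketch below boosts the four-velocity of the circular motion numerically and compares the result with the closed-form components $(\gamma_u\gamma Qc, -\gamma_u\gamma(v\sin\frac{vt}{r}+u), \gamma v\cos\frac{vt}{r}, 0)$. It is only an illustration: the helper names `boost_x` and `circular_four_velocity` and the numerical values are assumptions, and units with $c = 1$ are used.

```python
import math


def boost_x(four_vec, u, c=1.0):
    """Boost a four-vector (A0, Ax, Ay, Az) to a frame moving with speed u along x."""
    g = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    a0, ax, ay, az = four_vec
    return (g * (a0 - u * ax / c), g * (ax - u * a0 / c), ay, az)


def circular_four_velocity(v, r, t, c=1.0):
    """Four-velocity in Sigma for the orbit x = r cos(vt/r), y = r sin(vt/r)."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    phase = v * t / r
    return (g * c, -g * v * math.sin(phase), g * v * math.cos(phase), 0.0)


# Illustrative values (assumed), in units with c = 1.
c, u, v, r, t = 1.0, 0.6, 0.5, 2.0, 1.3

vi = circular_four_velocity(v, r, t, c)
vi_prime = boost_x(vi, u, c)          # numerical boost to Sigma'

# Closed-form components of the example.
g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
gu = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
phase = v * t / r
Q = 1.0 - u * (-v * math.sin(phase)) / c**2
expected = (gu * g * Q * c,
            -gu * g * (v * math.sin(phase) + u),
            g * v * math.cos(phase),
            0.0)

# gamma' read off from the zeroth component should equal gamma_u * gamma * Q.
gamma_prime = vi_prime[0] / c
```

The tuples `vi_prime` and `expected` agree component by component, and `gamma_prime` reproduces relation (6.7).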


6.3 Relativistic Composition of 3-Vectors 


A basic result of Newtonian Physics, with numerous applications, is the composition 
rule for Newtonian 3-vectors. This rule relates the observations of two Newtonian 
Inertial Observers (NIO) and concerns one Newtonian physical quantity. Let us 
recall how this rule is defined. 

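Consider two NIO $N_1$ and $N_2$ and the Newtonian mass point $P$, which wrt
$N_1$, $N_2$ has respectively:

• Position vector: $\mathbf{r}_{P_1}$, $\mathbf{r}_{P_2}$
• Velocity: $\mathbf{u}_{P_1}$, $\mathbf{u}_{P_2}$
• Acceleration: $\mathbf{a}_{P_1}$, $\mathbf{a}_{P_2}$.

Then, if the NIO $N_2$ wrt the NIO $N_1$ has:

• Position vector: $\mathbf{r}_{N_{21}}$
• Velocity: $\mathbf{u}_{N_{21}}$
• Acceleration: $\mathbf{a}_{N_{21}}$

Newtonian kinematics postulates the relations:

$$\mathbf{r}_{P_1} = \mathbf{r}_{P_2} + \mathbf{r}_{N_{21}}$$

$$\mathbf{u}_{P_1} = \mathbf{u}_{P_2} + \mathbf{u}_{N_{21}} \tag{6.10}$$

$$\mathbf{a}_{P_1} = \mathbf{a}_{P_2} + \mathbf{a}_{N_{21}}$$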


which are called the Newtonian composition rule for the Newtonian position vec- 
tor, velocity and acceleration respectively. This rule applies to any other Newtonian 
3-vector associated with the Newtonian point mass. 

We note that relations (6.10) express the Galileo transformation for the posi- 
tion, the velocity and the acceleration 3-vectors respectively. This observation is 
important, because it shows that the Newtonian composition rule for 3-vectors is 
fundamental to geometry (it expresses the linearity of the Newtonian space) and to 
Physics, because it expresses the Galileo Principle of Relativity. This explains why 
the breaking of this rule by the velocity of light necessitated the introduction of 
Special Relativity. 

Using the above observation, we extend the 3-vector (no longer a Newtonian
3-vector!) composition rule to Special Relativity by a direct application of the Lorentz
transformation. Working in a similar manner we consider two LCF $\Sigma_1$ and $\Sigma_2$ and
one ReMaP $P$, which relative to $\Sigma_1$, $\Sigma_2$ has respectively:

1. Position four-vector: $x^i_{P_1}$, $x^i_{P_2}$
2. Four-velocity: $u^i_{P_1}$, $u^i_{P_2}$
3. Four-acceleration: $a^i_{P_1}$, $a^i_{P_2}$.

If $L(1, 2)$ is the Lorentz transformation relating $\Sigma_1$, $\Sigma_2$ then the following
relations apply:

$$x^i_{P_1} = L(1, 2)\,x^i_{P_2}$$

$$u^i_{P_1} = L(1, 2)\,u^i_{P_2} \tag{6.11}$$

$$a^i_{P_1} = L(1, 2)\,a^i_{P_2}.$$


These equations induce a relation among the 3-vectors of the spatial parts of the 
respective four-vectors. This relation we call the relativistic composition rule for 
3-vectors. This rule is different from the Newtonian composition rule, because the
latter expresses the linearity of Newtonian space and the Galileo Principle of Relativity, whereas
the former expresses the linearity of Minkowski space and the Einstein Principle of Relativity.

Obviously, the detailed expression of the relativistic composition rule will 
depend on the four-vector concerned. Most useful (and important) is the relativistic 
composition rule for 3-velocities, which we derive in the next example. 


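Example 6.3.1

a. Consider two LCF $\Sigma$, $\Sigma'$ with parallel axes and relative velocity $\mathbf{u}$. Let a ReMaP
$P$ have velocities $\mathbf{v}$ and $\mathbf{v}'$ in $\Sigma$, $\Sigma'$ respectively. Prove that:

$$\mathbf{v}' = \frac{1}{\gamma_u Q}\left\{\mathbf{v} + \left[(\gamma_u - 1)\frac{\mathbf{u}\cdot\mathbf{v}}{u^2} - \gamma_u\right]\mathbf{u}\right\} \tag{6.12}$$

where $\gamma_{v'} = \gamma_u\gamma_v Q$ and $Q = 1 - \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}$.
Replace $\frac{\gamma_u - 1}{u^2} = \frac{\gamma_u^2}{(1 + \gamma_u)c^2}$ to find the equivalent form:

$$\mathbf{v}' = \frac{1}{\gamma_u Q}\left\{\mathbf{v} + \left[\frac{\gamma_u}{1 + \gamma_u}\frac{\mathbf{u}\cdot\mathbf{v}}{c^2} - 1\right]\gamma_u\mathbf{u}\right\}. \tag{6.13}$$

In the special case where the velocity $\mathbf{u}$ is parallel to the $x$-axis, so that $\Sigma$, $\Sigma'$ are
related with a boost, show that (6.12) reduces to ($Q = 1 - \frac{u v_x}{c^2}$):

$$v_{x'} = \frac{v_x - u}{Q}, \qquad v_{y'} = \frac{v_y}{\gamma_u Q}, \qquad v_{z'} = \frac{v_z}{\gamma_u Q}. \tag{6.14}$$

b. Using the results of a. prove that if the velocity of $\Sigma'$ wrt $\Sigma$ is $\mathbf{u}$, then the velocity
of $\Sigma$ wrt $\Sigma'$ is $-\mathbf{u}$.

c. Consider three LCF $\Sigma_1$, $\Sigma_2$ and $\Sigma_3$ such that $\Sigma_2$ is moving in the standard
configuration wrt the LCF $\Sigma_1$ with speed $u$ along the common axis $x_1, x_2$ and $\Sigma_3$
moves wrt $\Sigma_2$ in the standard configuration with speed $v$ along the common axis
$y_2, y_3$. Let $\mathbf{v}_{31}$ be the 3-velocity of $\Sigma_3$ wrt $\Sigma_1$ and $\mathbf{v}_{13}$ the 3-velocity of $\Sigma_1$ wrt
$\Sigma_3$. Show that the angle $\theta_{31}$ of $\mathbf{v}_{31}$ with the $x_1$-axis is given by $\tan\theta_{31} = \frac{v}{u\gamma_u}$
and the angle $\theta_{13}$ of $\mathbf{v}_{13}$ with the axis $x_3$ is given by $\tan\theta_{13} = \frac{v\gamma_v}{u}$. Compute the
difference $\Delta\theta = \theta_{31} - \theta_{13}$ in terms of the speeds $u$, $v$ and discuss the result.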


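Solution
The four-velocity $v^i$ of $P$ in $\Sigma$, $\Sigma'$ respectively is:

$$v^i = (\gamma_v c, \gamma_v\mathbf{v})_{\Sigma} = (\gamma_{v'} c, \gamma_{v'}\mathbf{v}')_{\Sigma'}.$$

These coordinates are related by the Lorentz transformation (1.51) and (1.52)
relating $\Sigma$, $\Sigma'$. For the zeroth component we find ($Q = 1 - \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}$):

$$\gamma_{v'} c = \gamma_u\left(\gamma_v c - \gamma_v\frac{\mathbf{u}\cdot\mathbf{v}}{c}\right) \;\Rightarrow\; \frac{\gamma_v}{\gamma_{v'}} = \frac{1}{\gamma_u\left(1 - \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right)} = \frac{1}{\gamma_u Q}. \tag{6.15}$$

For the spatial components we have:

$$\gamma_{v'}\mathbf{v}' = \gamma_v\mathbf{v} + \left[(\gamma_u - 1)\frac{\mathbf{u}\cdot\gamma_v\mathbf{v}}{u^2} - \gamma_u\gamma_v\right]\mathbf{u} = \gamma_v\left\{\mathbf{v} + \left[(\gamma_u - 1)\frac{\mathbf{u}\cdot\mathbf{v}}{u^2} - \gamma_u\right]\mathbf{u}\right\}.$$

It follows:

$$\mathbf{v}' = \frac{\gamma_v}{\gamma_{v'}}\left\{\mathbf{v} + \left[(\gamma_u - 1)\frac{\mathbf{u}\cdot\mathbf{v}}{u^2} - \gamma_u\right]\mathbf{u}\right\}. \tag{6.16}$$

Replacing from (6.15) we find the required result.

If in (6.16) we consider $\mathbf{u} = u\,\mathbf{i}$ we obtain (6.14).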
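2nd solution
We consider first the boost for the position vector between $\Sigma$ and $\Sigma'$:

$$t' = \gamma_u\left(t - \frac{ux}{c^2}\right), \qquad x' = \gamma_u(x - ut), \qquad y' = y, \qquad z' = z.$$

It follows:

$$v_{x'} = \frac{dx'}{dt'} = \frac{dx'}{dt}\frac{dt}{dt'} = \gamma_u(v_x - u)\frac{dt}{dt'}.$$

From the transformation of the zeroth component we have:

$$c\,dt' = \gamma_u\left(c\,dt - \frac{u}{c}v_x\,dt\right) \;\Rightarrow\; \frac{dt}{dt'} = \frac{1}{\gamma_u\left(1 - \frac{u v_x}{c^2}\right)} = \frac{1}{\gamma_u Q}. \tag{6.17}$$

Replacing we find:

$$v_{x'} = \frac{v_x - u}{Q}.$$

Similarly, we calculate:

$$v_{y'} = \frac{dy'}{dt'} = \frac{dy'}{dt}\frac{dt}{dt'} = \frac{v_y}{\gamma_u Q}, \qquad v_{z'} = \frac{v_z}{\gamma_u Q}.$$

In the general case we have:

$$\mathbf{v}' = \frac{d\mathbf{r}'}{dt'} = \frac{d\mathbf{r}'}{dt}\frac{dt}{dt'} = \frac{dt}{dt'}\frac{d}{dt}\left\{\mathbf{r} + \left[(\gamma_u - 1)\frac{\mathbf{u}\cdot\mathbf{r}}{u^2} - \gamma_u t\right]\mathbf{u}\right\} = \frac{dt}{dt'}\left\{\mathbf{v} + \left[(\gamma_u - 1)\frac{\mathbf{u}\cdot\mathbf{v}}{u^2} - \gamma_u\right]\mathbf{u}\right\}.$$

Replacing $\frac{dt}{dt'}$ from (6.17) (which holds for general motion with parallel axes, with $Q = 1 - \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}$), we
obtain (6.12).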


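b. The velocity $\mathbf{v}$ of $\Sigma$ wrt itself equals $\mathbf{0}$. Replacing $\mathbf{v} = \mathbf{0}$ in (6.16) we find $\mathbf{v}' = -\mathbf{u}$.
c. We consider $\Sigma_3$ to be the proper frame of $P$ and apply (6.12). Let $\mathbf{v}_{31}$ be the
relative velocity of $\Sigma_3$ wrt $\Sigma_1$ and $\mathbf{v}_{32} = (0, v, 0)$ wrt $\Sigma_2$. The velocity of $\Sigma_1$ wrt
$\Sigma_2$ is $\mathbf{v}_{12} = (-u, 0, 0)$. Relation (6.12) gives:

$$\mathbf{v}_{31} = \frac{dt_2}{dt_1}\left\{\mathbf{v}_{32} + \left[(\gamma_{12} - 1)\frac{\mathbf{v}_{12}\cdot\mathbf{v}_{32}}{v_{12}^2} - \gamma_{12}\right]\mathbf{v}_{12}\right\} \;\Rightarrow\; \mathbf{v}_{31} = \left(u, \frac{v}{\gamma_u}, 0\right)$$

where we have used that:

$$\frac{dt_2}{dt_1} = \frac{1}{\gamma_{12}\left(1 - \frac{\mathbf{v}_{12}\cdot\mathbf{v}_{32}}{c^2}\right)} = \frac{1}{\gamma_u}.$$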




Similarly, for $\mathbf{v}_{13}$ we have:

$$\mathbf{v}_{13} = \frac{dt_2}{dt_3}\left\{\mathbf{v}_{12} + \left[(\gamma_{32} - 1)\frac{\mathbf{v}_{12}\cdot\mathbf{v}_{32}}{v_{32}^2} - \gamma_{32}\right]\mathbf{v}_{32}\right\} \;\Rightarrow\; \mathbf{v}_{13} = \left(-\frac{u}{\gamma_v}, -v, 0\right).$$

We note that the two 3-velocities lie in the planes $x_1 y_1$ and $x_3 y_3$. Therefore, in
order to give their direction, it is enough to give the angle they make with the axes
$x_1$ and $x_3$ respectively. The angle of $\mathbf{v}_{31}$ with the $x_1$-axis is:

$$\tan\theta_{31} = \frac{v}{u\gamma_u}$$

and similarly, that of $\mathbf{v}_{13}$ with the $x_3$-axis:

$$\tan\theta_{13} = \frac{v\gamma_v}{u}.$$

In Newtonian Physics (where $\gamma_u = \gamma_v = 1$) we have:

$$\theta_{31} = \theta_{13}.$$

In relativistic kinematics this does not hold and the two relative velocities are not
collinear. They have a deviation $\Delta\theta = \theta_{31} - \theta_{13}$ as follows:

$$\tan\Delta\theta = \tan(\theta_{31} - \theta_{13}) = \frac{\tan\theta_{31} - \tan\theta_{13}}{1 + \tan\theta_{13}\tan\theta_{31}} = \frac{uv(1 - \gamma_u\gamma_v)}{u^2\gamma_u + v^2\gamma_v}.$$

Replacing the lengths $u^2$, $v^2$ in terms of the corresponding $\gamma$ we find the result:

$$\tan\Delta\theta = -\frac{\gamma_u\gamma_v}{\gamma_u + \gamma_v}\frac{uv}{c^2}.$$

We note that in general $\Delta\theta \neq 0$. Kinematically this means that the Lorentz
transformation does not preserve the 3-directions. This does not bother us because,
unlike in Newtonian Physics, 3-directions are not covariant. Assuming
speeds $v_i \ll c$, we have $\gamma_i = 1 + O(\beta_u^2, \beta_v^2)$ $(i = u, v)$ and $\Delta\theta = -\tan^{-1}(\beta_u\beta_v/2) =
-\beta_u\beta_v/2 + O(\beta_u^2, \beta_v^2)$. This result is related to the Thomas precession, which will
be considered in Sect. 6.6.
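
The general composition rule (6.12) and the result of part c can be checked numerically. The sketch below is illustrative only: the function `compose` and the chosen speeds are assumptions, units with $c = 1$ are used, and the frame velocity $\mathbf{u}$ must be non-zero because of the division by $u^2$.

```python
import math


def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def gamma(v, c=1.0):
    """Lorentz factor of a 3-velocity."""
    return 1.0 / math.sqrt(1.0 - dot(v, v) / c**2)


def compose(v, u, c=1.0):
    """Velocity v' in Sigma' of a particle with velocity v in Sigma, as in (6.12).

    u is the (non-zero) velocity of Sigma' wrt Sigma; parallel axes are assumed.
    """
    gu = gamma(u, c)
    Q = 1.0 - dot(u, v) / c**2
    k = (gu - 1.0) * dot(u, v) / dot(u, u) - gu
    return tuple((vi + k * ui) / (gu * Q) for vi, ui in zip(v, u))


# Part c with illustrative speeds (units with c = 1).
u, v = 0.6, 0.7
v31 = compose((0.0, v, 0.0), (-u, 0.0, 0.0))   # Sigma_3 as seen from Sigma_1
v13 = compose((-u, 0.0, 0.0), (0.0, v, 0.0))   # Sigma_1 as seen from Sigma_3

theta31 = math.atan2(v31[1], v31[0])
theta13 = math.atan2(-v13[1], -v13[0])         # angle of the line of v13 with x3

gu, gv = gamma((u, 0.0, 0.0)), gamma((0.0, v, 0.0))
closed_form = -gu * gv * u * v / (gu + gv)     # tan(theta31 - theta13), c = 1
```

Here `math.tan(theta31 - theta13)` agrees with `closed_form`, and for a frame velocity along the $x$-axis `compose` reproduces the boost formulas (6.14).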


In Sect. 4.5.3 we classified the relativistic particles in terms of the type of their 
worldlines and their speed. More specifically, the particles with timelike worldline 
and speed u < c we named bradyons, the particles with null worldline and speed 
u = c we called luxons and the (hypothetical) “particles” with spacelike worldline
and speed u > c tachyons. Because the speed is not a relativistic quantity, we have 
to prove that this classification is covariant. That is, if a photon is a photon for one 




LCF then it must be a photon for all LCF and not change to a mass particle for some 
LCF. However, the frequency of the photon can change. 


Example 6.3.2 


a. Assume the standard one-dimensional configuration for the motion of a ReMaP
and show that under the transformation 


the composite velocity v’ remains unchanged. 
b. Extend the result to a general 3-dimensional motion and show that the same result
holds for the speed of the ReMaP.


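Example 6.3.3 A ReMaP $P$ has 3-velocity $\mathbf{u}$ and $\mathbf{u}'$ in the LCF $\Sigma$ and $\Sigma'$
respectively. Assuming that $\Sigma'$ is related with $\Sigma$ by a boost along the common
$x, x'$ axis with speed $v$, show that:

a. If $v < c$ then $u' < c \Rightarrow u < c$ and $u' > c \Rightarrow u > c$.

b. Prove that $\frac{\gamma(u')}{\gamma(u)} = \gamma(v)\left(1 - \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right)$.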


Solution 


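a. From the invariance of the length of the position vector we have:

$$-c^2 dt^2 + dx^2 + dy^2 + dz^2 = -c^2 (dt')^2 + (dx')^2 + (dy')^2 + (dz')^2 \;\Rightarrow\; dt^2 (c^2 - u^2) = dt'^2 (c^2 - u'^2). \tag{6.18}$$

The boost relating $\Sigma$ and $\Sigma'$ gives:

$$dt' = \gamma(v)\left(dt - \frac{v}{c^2}dx\right) \;\Rightarrow\; dt' = \gamma(v)\,dt\left(1 - \frac{u_x v}{c^2}\right). \tag{6.19}$$

Replacing in (6.18) we find:

$$c^2 - u^2 = \gamma^2(v)\left(1 - \frac{u_x v}{c^2}\right)^2 (c^2 - u'^2) = \omega\,(c^2 - u'^2)$$

where:

$$\omega = \gamma^2(v)\left(1 - \frac{u_x v}{c^2}\right)^2 > 0.$$

Because $\omega$ is always positive, we have that $u' < c \Rightarrow u < c$ and $u' > c \Rightarrow u > c$.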


Fig. 6.1 Relativistic composition rule for 3-velocities
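b. Using (6.18) and (6.19) we find:

$$\frac{c^2 - u^2}{c^2 - u'^2} = \left(\frac{dt'}{dt}\right)^2 \;\Rightarrow\; \frac{\gamma(u')}{\gamma(u)} = \gamma(v)\left(1 - \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right). \tag{6.20}$$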


It is interesting to plot the speed of $P$ in case $P$ moves along the common $x, x'$
axis. If $v_1$, $v_2$ is the speed of $P$ in the LCF $\Sigma_1$, $\Sigma_2$ respectively, we have:

$$v_1 = \frac{v_2 + u}{1 + \frac{u v_2}{c^2}}. \tag{6.21}$$

The plot of $v_1$ as a function of $u$ for various values of $v_2$ is shown in Fig. 6.1.
The graph is fully symmetric² and consists of nine branches defined by the
parametric values

$$v_2 < -c, \quad v_2 = -c, \quad -c < v_2 < 0, \quad v_2 = 0, \quad 0 < v_2 < c, \quad v_2 = c, \quad v_2 > c.$$


² This is consistent with the so-called Reciprocity Principle, which is frequently stated in Special
Relativity and expresses the relativity of relativistic motion.




We note that the graph divides the plane $(v_1, u)$ in five distinct regions as follows:

$$v_1 < -c, \quad v_1 = -c, \quad -c < v_1 < c, \quad v_1 = c, \quad v_1 > c.$$

In each region we have values for the parameter $v_2$ so that $v_1$ and $v_2$ are always
in the same region. For example, if the value of $v_1$ is in the region $-c < v_1 < c$,
then the value of $v_2$ is in the same region. This result shows that the classification of
the particles according to their speed is covariant. More specifically, bradyons are
always in the region $(-c, c)$, photons in the region $v_1 = v_2 = \pm c$ and “tachyons”
in the region $(c, +\infty) \cup (-\infty, -c)$.
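
The covariance of the classification can be illustrated with a short numerical experiment based on (6.21). The following Python sketch is only an illustration under stated assumptions: the names `compose_1d` and `speed_class`, the tolerance, and the sample values are hypothetical choices, and units with $c = 1$ are used with subluminal frame speeds $u$.

```python
def compose_1d(v2, u, c=1.0):
    """Speed v1 in Sigma_1 from the speed v2 in Sigma_2, following (6.21)."""
    return (v2 + u) / (1.0 + u * v2 / c**2)


def speed_class(v, c=1.0, tol=1e-12):
    """Label a speed as 'bradyon', 'luxon' or 'tachyon'."""
    if abs(abs(v) - c) <= tol:
        return "luxon"
    return "bradyon" if abs(v) < c else "tachyon"


# Sample speeds v2 (in units of c) covering the three classes,
# and a few subluminal frame speeds u.
samples = [-3.0, -1.0, -0.5, 0.0, 0.5, 1.0, 3.0]
frames = [-0.9, -0.3, 0.0, 0.3, 0.9]

# True if every composed speed stays in the class of the original speed.
preserved = all(
    speed_class(compose_1d(v2, u)) == speed_class(v2)
    for v2 in samples
    for u in frames
)
```

For these samples `preserved` evaluates to `True`: a subluminal boost never moves a speed from one class to another, in agreement with the regions of Fig. 6.1.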


Example 6.3.4 A mobile is sliding freely along the positive $x$-axis with constant
speed $ac$ $(0 < a < 1)$. A smaller mobile is moving freely on the first along the same
direction with relative speed $ac$. A third even smaller mobile moves freely on the
second in the same direction with relative speed $ac$ and so on.

1. Assuming that Newtonian kinematics applies, prove that the speed $cv_{r+1}$ of the
$r + 1$ mobile wrt the $x$-axis is:

$$v_{r+1} = v_r + a \tag{6.22}$$

where $r = 1, 2, \ldots$. Also prove that in Newtonian kinematics the speed of the
$n$-th mobile is given by the relation:

$$v_{n+1} = (n + 1)a. \tag{6.23}$$

Conclude that the speeds $v_n$ are members of an arithmetic series with step $a$.
Finally, show that $\lim_{n\to\infty} v_n \to +\infty$.

2. Repeat the same questions for relativistic kinematics and show that:

$$v_{r+1} = \frac{v_r + a}{1 + a v_r}, \qquad v_n = \frac{(1 + a)^n - (1 - a)^n}{(1 + a)^n + (1 - a)^n}, \qquad \lim_{n\to\infty} v_n = 1. \tag{6.24}$$

Discuss the result.




Solution 

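The Newtonian part is obvious. Concerning the relativistic part, let $cv_r$ be the
speed of the $r$-th mobile wrt the $x$-axis and $ac$ the relative speed of the $r + 1$
mobile wrt the mobile $r$. From the relativistic composition rule for 3-velocities we
have:

$$cv_{r+1} = \frac{cv_r + ac}{1 + \frac{(cv_r)(ac)}{c^2}} = c\,\frac{v_r + a}{1 + v_r a} \;\Rightarrow\; v_{r+1} = \frac{v_r + a}{1 + a v_r}. \tag{6.25}$$

Consider the sequence $\zeta_r = \frac{1 - v_r}{1 + v_r}$, $r = 1, 2, \ldots$. We find:

$$\zeta_{r+1} = \frac{1 - v_{r+1}}{1 + v_{r+1}} = \frac{1 - \frac{v_r + a}{1 + v_r a}}{1 + \frac{v_r + a}{1 + v_r a}} = \frac{1 + a v_r - v_r - a}{1 + a v_r + v_r + a} = \frac{(1 - a)(1 - v_r)}{(1 + a)(1 + v_r)} = \frac{1 - a}{1 + a}\,\zeta_r.$$

This reduction formula gives:

$$\zeta_{r+1} = \left(\frac{1 - a}{1 + a}\right)^r \zeta_1$$

which implies that the terms $\zeta_r$ of the sequence are members of a geometric series
with step $\frac{1 - a}{1 + a} \in (0, 1)$. Obviously, this sequence converges to $\lim_{r\to\infty}\zeta_r = 0$, hence
the velocity $\lim_{r\to\infty} cv_r = c$, as expected. The first term of the series is³:

$$\zeta_1 = \frac{1 - v_1}{1 + v_1} = \frac{1 - a}{1 + a}$$

thus $\zeta_r = \left(\frac{1 - a}{1 + a}\right)^r$. From the relation defining $\zeta_r$ we have:

$$\frac{1 - v_r}{1 + v_r} = \left(\frac{1 - a}{1 + a}\right)^r$$

from which follows:

$$v_r = \frac{(1 + a)^r - (1 - a)^r}{(1 + a)^r + (1 - a)^r}.$$

The rest are left as an exercise for the reader.

³ The solution of the problem is a particular case of the sequence $a_{r+1} = \frac{\kappa a_r + \lambda}{\mu + \nu a_r}$, whose terms $a_r$
converge to a common value $x$. In this case, we set $x = \frac{\kappa x + \lambda}{\mu + \nu x}$ and compute the roots $\rho_1$, $\rho_2$ of the
quadratic equation. Then the following recursion formula holds

$$\frac{a_{r+1} - \rho_1}{a_{r+1} - \rho_2} = A\,\frac{a_r - \rho_1}{a_r - \rho_2}$$

where $A$ is a constant. [Study the case $\rho_1 = \rho_2$.] In our example $a_r = v_r$ and $\kappa = 1$, $\lambda = a$, $\mu = 1$,
$\nu = a$. The equation $x = \frac{x + a}{1 + ax} \Rightarrow ax^2 + x = x + a \Rightarrow x = \pm 1$, hence the reduction relation is:

$$\frac{v_{r+1} - 1}{v_{r+1} + 1} = A\,\frac{v_r - 1}{v_r + 1} \quad\text{or}\quad \zeta_{r+1} = A\zeta_r.$$

In order to calculate $A$, we consider the terms $\zeta_2 = A\zeta_1$ and replacing we find $A = \frac{1 - a}{1 + a}$ etc.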
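
The recursion and its closed form can be compared numerically. The sketch below is a minimal illustration; the function names `relativistic_speeds` and `closed_form` and the value $a = 0.5$ are assumptions made for the example, and all speeds are in units of $c$.

```python
def relativistic_speeds(a, n):
    """Speeds v_1..v_n (units of c) from v_{r+1} = (v_r + a)/(1 + a v_r), v_1 = a."""
    speeds = [a]
    for _ in range(n - 1):
        v = speeds[-1]
        speeds.append((v + a) / (1.0 + a * v))
    return speeds


def closed_form(a, r):
    """v_r = ((1+a)^r - (1-a)^r) / ((1+a)^r + (1-a)^r)."""
    p, m = (1.0 + a) ** r, (1.0 - a) ** r
    return (p - m) / (p + m)


a = 0.5
recursive = relativistic_speeds(a, 10)
explicit = [closed_form(a, r) for r in range(1, 11)]

# Newtonian speeds grow without bound, the relativistic ones approach 1.
newtonian = [r * a for r in range(1, 11)]
```

The lists `recursive` and `explicit` coincide within rounding errors; the relativistic speeds increase monotonically but stay below 1, while the Newtonian speeds exceed 1 from the third mobile onwards.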


6.4 Relative Four-Vectors 


Another important relation among velocity, acceleration and Newtonian vectors in 
general, is the relative velocity, acceleration, and vector respectively. The concept 
of relative vector involves one observer and two mass points, contrary to the idea of 
the composition of vectors, which involves two observers and one mass point. Con- 
sequently, the concept of relative vector cannot involve the Galileo transformation, 
which concerns two Newtonian observers. Let us recall the definition of relative 
vectors in Newtonian Physics. 

Consider a NIO (Newtonian Inertial Observer) N wrt whom two Newtonian mass 
points P, Q have respectively: 


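• Position vectors: $\mathbf{r}_P$, $\mathbf{r}_Q$
• Velocities: $\mathbf{v}_P$, $\mathbf{v}_Q$
• Accelerations: $\mathbf{a}_P$, $\mathbf{a}_Q$.

Then the relative position vector $\mathbf{PQ}$ of the Newtonian mass point $Q$ wrt the
Newtonian mass point $P$ with reference to the observer $N$ is defined to be the vector:

$$\mathbf{PQ} = \mathbf{r}_{PQ} = \mathbf{r}_Q - \mathbf{r}_P.$$

For the velocity we have respectively:

$$\mathbf{v}_{PQ} = \mathbf{v}_Q - \mathbf{v}_P \tag{6.26}$$

and for the acceleration:

$$\mathbf{a}_{PQ} = \mathbf{a}_Q - \mathbf{a}_P.$$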


This definition expresses the linearity of the space of Newtonian Physics and it is 
different from the rule of composition of Newtonian vectors, which is a consequence 




of the Galileo transformation (and involves two observers). The fact that the two 
definitions seem to “coincide”, is due to the simplicity of the geometric structure 
of the linear space $R^3$ (something similar occurs with the real numbers, which
practically share all properties). 

The concept of relative vector is taken directly over to Special Relativity as
follows. Consider a LCF $\Sigma$ wrt which two relativistic mass points $P$, $Q$ have
respectively:

1. Position four-vector: $x^i_P$, $x^i_Q$
2. Four-velocity: $v^i_P$, $v^i_Q$
3. Four-acceleration: $a^i_P$, $a^i_Q$.

Then the relative position four-vector of $Q$ wrt $P$ with reference to the observer
$\Sigma$ is defined to be the four-vector:

$$x^i_{PQ} = x^i_Q - x^i_P.$$

In a similar manner, the relative four-velocity $v^i_{PQ}$ and the relative
four-acceleration $a^i_{PQ}$ of $Q$ wrt $P$ with reference to the observer $\Sigma$ are defined to be
the four-vectors:

$$v^i_{PQ} = v^i_Q - v^i_P \tag{6.27}$$

$$a^i_{PQ} = a^i_Q - a^i_P. \tag{6.28}$$

The definition of the relative four-vector in Special Relativity expresses the linearity
of the Minkowski space and, as is the case with Newtonian Physics, it is different
from the composition rule of four-vectors.⁴

The most important of the relative vectors is the relative velocity, which is used 
in the study of colliding beams, in order to determine the maximum interaction 
energy. For photons, the concept of relative velocity and acceleration makes no sense 
because these four-vectors are not defined for photons. 

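Consider the LCF $\Sigma$ with four-velocity $u^i$ and a ReMaP $P$ with four-velocity $v^i$.
The inner product $u^i v_i$ is an invariant, therefore its value is the same in all LCF. In
the proper frame $\Sigma^+$ of $u^i$ one has:

$$u^i = \begin{pmatrix} c \\ \mathbf{0} \end{pmatrix}_{\Sigma^+}, \qquad v^i = \begin{pmatrix} \gamma(v)c \\ \gamma(v)\mathbf{v} \end{pmatrix}_{\Sigma^+}$$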


⁴ The concept of relative four-vector cannot be extended to theories of Physics formulated over
a non-linear space, i.e. curved spaces (e.g. General Relativity). The reason is that the relative
vector involves the difference of vectors defined at different points in space, therefore one has to
“transport” one of the vectors to the point of application of the other, an operation which
necessarily involves the curvature of the space.




from which follows:

$$u^i v_i = -\gamma(v)c^2 \;\Rightarrow\; \gamma(v) = -\frac{1}{c^2}u^i v_i. \tag{6.29}$$

Relation (6.29) expresses the $\gamma$-factor of a ReMaP wrt an observer in terms of
the inner product of their four-velocities. Consider now two ReMaP, 1, 2 say, with
four-velocities $v^i_1$, $v^i_2$ and relative four-velocity $v^i_{21} = v^i_1 - v^i_2$. Then:

$$v^i_{21}(v_{21})_i = (v^i_1 - v^i_2)\left[(v_1)_i - (v_2)_i\right] = -2c^2 - 2v^i_1 (v_2)_i.$$

Let $\mathbf{v}_{21}$ be the 3-velocity of ReMaP 1 as measured in the proper frame of 2. Then
according to (6.29):

$$\gamma(v_{21}) = -\frac{1}{c^2}v^i_1 (v_2)_i \tag{6.30}$$

hence:

$$v^i_{21}(v_{21})_i = -2c^2 + 2c^2\gamma(v_{21}) = 2c^2\left[\gamma(v_{21}) - 1\right]. \tag{6.31}$$


Relation (6.31) leads to two conclusions: 


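1. $\gamma(v_{21}) > 1 \Rightarrow v^i_{21}(v_{21})_i > 0$, which means that the relative four-velocity is a
spacelike vector, hence not a four-velocity.

2. The magnitude of $\mathbf{v}_{21}$ is determined uniquely from the 3-velocities of the two ReMaP
in $\Sigma$.

The second result is not obvious and it will be useful to prove it. Suppose that in
$\Sigma$ the four-velocities of 1, 2 are:

$$v^i_1 = \begin{pmatrix} \gamma(v_1)c \\ \gamma(v_1)\mathbf{v}_1 \end{pmatrix}_{\Sigma}, \qquad v^i_2 = \begin{pmatrix} \gamma(v_2)c \\ \gamma(v_2)\mathbf{v}_2 \end{pmatrix}_{\Sigma}.$$

Then in $\Sigma$ we have:

$$v^i_1 (v_2)_i = -\gamma(v_1)\gamma(v_2)c^2 + \gamma(v_1)\gamma(v_2)(\mathbf{v}_1\cdot\mathbf{v}_2) = -\gamma(v_1)\gamma(v_2)c^2\left(1 - \frac{\mathbf{v}_1\cdot\mathbf{v}_2}{c^2}\right).$$

From (6.30) we conclude:

$$\gamma(v_{12}) = \gamma(v_{21}) = \gamma(v_1)\gamma(v_2)Q \tag{6.32}$$

where $Q = 1 - \frac{\mathbf{v}_1\cdot\mathbf{v}_2}{c^2}$. Replacing the $\gamma$-factors in terms of $\beta = v/c$ we find the
symmetric relation:

$$\beta_{21}^2 = 1 - \frac{(1 - \beta_1^2)(1 - \beta_2^2)}{(1 - \boldsymbol{\beta}_1\cdot\boldsymbol{\beta}_2)^2}. \tag{6.33}$$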




We note that the right parts of the equations (6.32) and (6.33) are symmetric wrt
the velocities $\mathbf{v}_1$, $\mathbf{v}_2$. This implies that if $\mathbf{v}_{21}$ is the relative velocity of 1 wrt 2 and $\mathbf{v}_{12}$
is the relative velocity of 2 wrt 1 then:

$$|\mathbf{v}_{12}| = |\mathbf{v}_{21}| \tag{6.34}$$

that is, the magnitudes of the relative velocities are equal (but, as will be shown
below, their directions are different!). Furthermore, we note that the relative speed
obtains its maximum value when the denominator in (6.33) takes its maximum
value, that is, when $\boldsymbol{\beta}_1\cdot\boldsymbol{\beta}_2 = -\beta_1\beta_2$. This happens when 1, 2 move anti-parallel
in $\Sigma$.

Relation (6.33) can be written in a different form, useful in the solution of 
problems. 


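Exercise 6.4.1 Making use of the 3-vector identity $|\mathbf{a}\times\mathbf{b}|^2 = |\mathbf{a}|^2|\mathbf{b}|^2 - (\mathbf{a}\cdot\mathbf{b})^2$
show that (6.33) can be written:

$$\beta_{21}^2 = \frac{(\boldsymbol{\beta}_1 - \boldsymbol{\beta}_2)^2 - (\boldsymbol{\beta}_1\times\boldsymbol{\beta}_2)^2}{(1 - \boldsymbol{\beta}_1\cdot\boldsymbol{\beta}_2)^2}. \tag{6.35}$$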

Up to this point we have determined only the speeds which are measured in
the proper frames of the particles in terms of their velocities in $\Sigma$. The following
question arises: Is it possible to determine completely the velocities $\mathbf{v}_{12}$, $\mathbf{v}_{21}$, which
are measured in the proper frames of the particles 1 and 2, in terms of the velocities
$\mathbf{v}_1$, $\mathbf{v}_2$ which are measured in $\Sigma$? The answer is “yes”, provided that we work as we
did for the composition of four-vectors. That is, we consider the components
of the four-velocity of particle 1 in $\Sigma$ and in the proper frame $\Sigma_2$ of particle 2:

$$v^i_1 = \begin{pmatrix} \gamma(v_1)c \\ \gamma(v_1)\mathbf{v}_1 \end{pmatrix}_{\Sigma}, \qquad v^i_1 = \begin{pmatrix} \gamma(v_{21})c \\ \gamma(v_{21})\mathbf{v}_{21} \end{pmatrix}_{\Sigma_2}$$

and demand the two expressions of $v^i_1$ to be related with a Lorentz transformation
with velocity $\mathbf{v}_2$ (and similarly for particle 2). The 3-velocities $\mathbf{v}_{21}$, $\mathbf{v}_{12}$ defined in this way we call the relative
3-velocity of particle 1 wrt particle 2 and the relative 3-velocity of particle 2 wrt particle
1 respectively. In Newtonian Physics we have:

$$\mathbf{v}_{21} = -\mathbf{v}_{12} = \mathbf{v}_1 - \mathbf{v}_2.$$

However in Special Relativity nothing is obvious and everything has to be
calculated explicitly. Let us compute $\mathbf{v}_{21}$. Using the general result (6.13) of
Example 6.3.1, we write:

$$\mathbf{v}_{21} = \frac{1}{\gamma(v_2)Q}\left\{\mathbf{v}_1 + \left[\frac{\gamma(v_2)}{1 + \gamma(v_2)}\frac{\mathbf{v}_1\cdot\mathbf{v}_2}{c^2} - 1\right]\gamma(v_2)\mathbf{v}_2\right\}. \tag{6.36}$$



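The term in the brackets is written as:

$$\mathbf{v}_1 + \left[\frac{\gamma^2(v_2)}{1 + \gamma(v_2)}\frac{\mathbf{v}_1\cdot\mathbf{v}_2}{c^2} - \gamma(v_2)\right]\mathbf{v}_2$$

$$= \mathbf{v}_1 - \mathbf{v}_2 + \left[1 - \gamma(v_2) + \frac{\gamma^2(v_2)}{1 + \gamma(v_2)}\frac{\mathbf{v}_1\cdot\mathbf{v}_2}{c^2}\right]\mathbf{v}_2$$

$$= \mathbf{v}_1 - \mathbf{v}_2 + \left[\frac{1 - \gamma^2(v_2)}{1 + \gamma(v_2)} + \frac{\gamma^2(v_2)}{1 + \gamma(v_2)}\frac{\mathbf{v}_1\cdot\mathbf{v}_2}{c^2}\right]\mathbf{v}_2$$

$$= \mathbf{v}_1 - \mathbf{v}_2 - \frac{\gamma^2(v_2)\beta_2^2}{\gamma(v_2) + 1}\left[1 - \frac{\mathbf{v}_1\cdot\mathbf{v}_2}{v_2^2}\right]\mathbf{v}_2$$

$$= \mathbf{v}_1 - \mathbf{v}_2 - \left[\gamma(v_2) - 1\right]\left[1 - \frac{\mathbf{v}_1\cdot\mathbf{v}_2}{v_2^2}\right]\mathbf{v}_2$$

where in the last two steps we have used the identity $\gamma^2 - 1 = \gamma^2\beta^2$. Finally:

$$\mathbf{v}_{21} = \frac{1}{\gamma(v_2)\left(1 - \frac{\mathbf{v}_1\cdot\mathbf{v}_2}{c^2}\right)}\left\{\mathbf{v}_1 - \mathbf{v}_2 - \left[\gamma(v_2) - 1\right]\left[1 - \frac{\mathbf{v}_1\cdot\mathbf{v}_2}{v_2^2}\right]\mathbf{v}_2\right\}. \tag{6.37}$$

We note that (6.37) is not symmetric in $\mathbf{v}_1$, $\mathbf{v}_2$. Indeed, we calculate:

$$\mathbf{v}_{12} = \frac{1}{\gamma(v_1)\left(1 - \frac{\mathbf{v}_1\cdot\mathbf{v}_2}{c^2}\right)}\left\{\mathbf{v}_2 - \mathbf{v}_1 - \left[\gamma(v_1) - 1\right]\left[1 - \frac{\mathbf{v}_1\cdot\mathbf{v}_2}{v_1^2}\right]\mathbf{v}_1\right\} \;\Rightarrow\; \mathbf{v}_{12} \neq -\mathbf{v}_{21}. \tag{6.38}$$

This result is different from the expected Newtonian result $\mathbf{v}_{12} = -\mathbf{v}_{21}$. This
must not bother us, because the vectors $\mathbf{v}_{12}$, $\mathbf{v}_{21}$ are 3-vectors of different relativistic
observers (the proper observers $\Sigma_1$ and $\Sigma_2$ of the particles 1, 2 respectively) and the
kinematics of relativistic observers is different from that of Newtonian observers.⁵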


⁵ There is a significant difference, which must be pointed out. The relative (relativistic) velocity $\mathbf{v}_{21}$
refers to the velocity of particle 1 in the proper frame $\Sigma_2$ of particle 2, therefore it involves the
relativistic measurement of velocity (not photon!) and the inequality $|\mathbf{v}_{21}| < c$ is expected to hold.
Indeed, if we consider two particles, one moving along the positive $x$-axis with speed $c/2$ and
the other moving along the negative $x$-axis with speed $c/2$, then using (6.35) we compute that the
(relativistic) relative speed of the particles equals $4c/5 < c$, whereas the Newtonian relative speed
is $c$.
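
The properties of the relative 3-velocity discussed above can be verified numerically from (6.37). The Python sketch below is only an illustration: the function `relative_velocity` and the sample velocities are assumptions, units with $c = 1$ are used, and the reference particle must have non-zero velocity because (6.37) divides by $v_2^2$.

```python
import math


def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def norm(v):
    """Euclidean length of a 3-vector."""
    return math.sqrt(dot(v, v))


def gamma(v, c=1.0):
    """Lorentz factor of a 3-velocity."""
    return 1.0 / math.sqrt(1.0 - dot(v, v) / c**2)


def relative_velocity(v1, v2, c=1.0):
    """3-velocity of particle 1 in the proper frame of particle 2, as in (6.37)."""
    g2 = gamma(v2, c)
    d = dot(v1, v2)
    Q = 1.0 - d / c**2
    k = (g2 - 1.0) * (1.0 - d / dot(v2, v2))
    return tuple((a - b - k * b) / (g2 * Q) for a, b in zip(v1, v2))


# Head-on case of the footnote: speeds c/2 along +x and -x.
head_on = relative_velocity((0.5, 0.0, 0.0), (-0.5, 0.0, 0.0))

# Non-collinear case: equal magnitudes, but v12 is not -v21.
v1, v2 = (0.6, 0.0, 0.0), (0.0, 0.7, 0.0)
v21 = relative_velocity(v1, v2)
v12 = relative_velocity(v2, v1)

# gamma-factor of the relative velocity compared with relation (6.32).
gamma_check = gamma(v1) * gamma(v2) * (1.0 - dot(v1, v2))
```

In this example `norm(head_on)` equals $0.8$, the lengths of `v12` and `v21` coincide in agreement with (6.34), `gamma(v21)` equals `gamma_check`, and yet `v12` differs from the opposite of `v21`.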




The 3-velocities $\mathbf{v}_{12}$, $\mathbf{v}_{21}$ differ only by a Euclidian rotation, therefore there exists
an angle $\psi$ and a direction $\hat{\mathbf{e}}$ normal to the plane defined by the velocities $\mathbf{v}_{12}$, $\mathbf{v}_{21}$
such that the rotation of $\mathbf{v}_{12}$ about the direction $\hat{\mathbf{e}}$ for an angle $\psi$ takes it over to $\mathbf{v}_{21}$.
The angle $\psi$ is known as Wigner angle and has wide use in the study of elementary

2 
particles.° In order to compute the Wigner angle we set A = ie = "ty a 
12 
1 . ; 
nz where yj = y(v;) (i = 1, 2) and get: 
sex vi2°V21 
V12U21 
1 5 (v1 - V2 — V5) (v1 - V2 — Vy) 
= M=¥2) = 2— 1) 3 i =1) - 
A U5 vy 
(V1 + V2)(V1 «V2 — V5) (V1 - V2 — V5) 
+(m-Dm-D = |. (6.39) 
UU, 
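We write $\mathbf{v}_1\cdot\mathbf{v}_2 = v_1 v_2\cos\phi$ and after a standard calculation we find:

$$\cos\psi = 1 - \frac{2\sin^2\phi}{1 + 2\rho\cos\phi + \rho^2} \tag{6.40}$$

where $\rho = \left[\frac{(\gamma_1 + 1)(\gamma_2 + 1)}{(\gamma_1 - 1)(\gamma_2 - 1)}\right]^{1/2}$ $(1 < \rho < \infty)$. Other equivalent forms of (6.40) are:

$$\tan\frac{\psi}{2} = \frac{\sin\phi}{\rho + \cos\phi} \tag{6.41}$$

$$\sin\psi = \frac{2\sin\phi\,(\rho + \cos\phi)}{1 + 2\rho\cos\phi + \rho^2}. \tag{6.42}$$

We note that the angle $\psi$:

- Is in the region $[0, \pi]$
- Depends symmetrically on $v_1$, $v_2$.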


Exercise 6.4.2 Prove that for $\phi = \frac{\pi}{2}$ the angle $\psi$ is given by the relation:


-—1 
v= tan! | =2cot! p. 


viv, + yrvs 


⁶ The interested reader can find more on the Wigner angle in e.g. A. Ben-Menahem (1983)
“Wigner’s rotation revisited” Am. J. Phys. 53, pp 62-66.




Assume $v_1 = v_2$ and show that in the Newtonian limit the angle $\psi \to 0$ (is of
order $O(\beta^2)$), while in the relativistic limit $\psi \to \frac{\pi}{2}$. Finally, show that for photons


v=. 


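Exercise 6.4.3 Working with the relative position, derive relation (6.37). Moreover,
assume that the position four-vector of a particle 1 in the LCF $\Sigma$ and $\Sigma_2$ is
respectively:

$$\begin{pmatrix} ct \\ \mathbf{r}_1 \end{pmatrix}_{\Sigma}, \qquad \begin{pmatrix} c\tau_2 \\ \mathbf{r}_{12} \end{pmatrix}_{\Sigma_2}.$$

Then consider the Lorentz transformation which relates $\mathbf{r}_{12}$ with $\mathbf{r}_1$, and calculate
the relative velocity from the relation $\mathbf{v}_{12} = \frac{d\mathbf{r}_{12}}{d\tau_2}$, where $\tau_2$ is the proper time of $\Sigma_2$.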


6.5 The 3-Velocity Space 


In Special Relativity we introduce various spaces besides the standard spacetime. 
Such a space is the 3-velocity space, which is used to study geometrically relative 
motion. The 3-velocity of a particle is described in terms of three components 
therefore in a 3-dimensional space the 3-velocity of a particle corresponds to a point. 
Following this observation, we consider a RIO $\Sigma_0$ and two particles, which in $\Sigma_0$
have velocities⁷ $\mathbf{v}_{I0}$, $I = 1, 2$. Then the velocity of $\Sigma_0$ is the origin, $O$ say, of the
coordinates of the velocity space (the zero velocity $\mathbf{0}$) and the velocities $\mathbf{v}_{I0}$, $I = 1, 2$
are two other points, the 1, 2 say, of that space. Joining the points $O$, 1, 2 with a
“straight line” (note the quotation marks!) we obtain the triangle $O12$ (Fig. 6.2).


Fig. 6.2 The hyperbolic 
triangle 


⁷ In this section we follow the notation that two indices in a velocity indicate the first quantity with
reference to the second. For example the velocity of $\Sigma_1$ wrt $\Sigma_2$ will be denoted as $\mathbf{v}_{12}$. Concerning
the angles we follow the notation that the angle between the velocities $\mathbf{v}_{10}$, $\mathbf{v}_{20}$ of $\Sigma_1$ and $\Sigma_2$ in
$\Sigma_0$ will be denoted by $A_{12}$.




In order to give a kinematic interpretation of this triangle we relate the sides of
the triangle with the speeds of the corresponding velocities. Thus the side 01 we
relate to the length of the velocity $\mathbf{v}_{10}$ and the length of the side 12 we relate with
the length of the relative velocity $\mathbf{v}_{12}$ (in $\Sigma_0$!). Because $\mathbf{v}_{IJ} = -\mathbf{v}_{JI}$, the side $IJ$
is of the same length as the side $JI$ with $I, J = 0, 1, 2$. In Newtonian kinematics,
due to the Newtonian composition rule of 3-velocities (6.26), the velocity triangle is
a typical triangle of Euclidian geometry, hence the velocity space is also a Euclidian
space. In Special Relativity, due to the relativistic composition formula (6.27), this
triangle is not Euclidian and the 3-velocity space is not a linear space.⁸

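In order to find more information about the 3-velocity space, we study the
velocity triangle. We consider a RIO $\Sigma_0$ and let the RIO $\Sigma_1, \Sigma_2$ have velocities
$\mathbf{v}_{10}, \mathbf{v}_{20}$ respectively wrt $\Sigma_0$. Then the relative velocity $\mathbf{v}_{21}$ of $\Sigma_2$ in $\Sigma_1$ (as
measured by $\Sigma_0$!) is given by (6.38) and its $\gamma$-factor $\gamma_{21} = \gamma(\mathbf{v}_{21})$ is given by
(6.30) or equivalently by (6.32). Furthermore, according to the composition formula
(6.37) the three 3-velocities $\mathbf{v}_{10}, \mathbf{v}_{20}, \mathbf{v}_{21}$ are coplanar (in $\Sigma_0$). Let $\gamma_{I0}$ be the
$\gamma$-factor of $\mathbf{v}_{I0}$ and $\alpha_{I0}$, $I = 1, 2$ the corresponding rapidity. Then $\gamma_{I0} = \cosh\alpha_{I0}$,
$\mathbf{v}_{I0} = c\tanh\alpha_{I0}\,\mathbf{e}_{I0}$, where $\mathbf{e}_{I0}$ is the unit vector along the direction of the 3-velocity
$\mathbf{v}_{I0}$, $I = 1, 2$. Let $\gamma_{21} = \cosh\alpha_{21}$ be the corresponding quantities for the relative
velocity. Then relation (6.32) gives for $\gamma_{21}$:

$$
\begin{aligned}
\cosh\alpha_{21} &= \cosh\alpha_{10}\cosh\alpha_{20}\left(1 - \tanh\alpha_{10}\tanh\alpha_{20}\,\mathbf{e}_{10}\cdot\mathbf{e}_{20}\right) \\
&= \cosh\alpha_{10}\cosh\alpha_{20} - \sinh\alpha_{10}\sinh\alpha_{20}\cos A_{12}
\end{aligned}
\tag{6.43}
$$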


where $A_{12}$ is the angle between the unit vectors $\mathbf{e}_{10}, \mathbf{e}_{20}$ along the 3-velocities
$\mathbf{v}_{10}, \mathbf{v}_{20}$ in $\Sigma_0$. Equation (6.43) is the cosine law of hyperbolic Euclidian geometry
for a hyperbolic triangle of sides $\alpha_{10}, \alpha_{20}, \alpha_{21}$ and angles $A_{12}, A_{01}, A_{02}$. This
geometry is identical to the Euclidian spherical trigonometry of a triangle with sides
$i\alpha_{10}, i\alpha_{20}, i\alpha_{21}$ and angles $A_{12}, A_{01}, A_{02}$, where we have considered the correspondence:

$$\cos(i\alpha) = \cosh\alpha. \tag{6.44}$$
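The cosine law (6.43) can be checked numerically. The following is a minimal sketch, assuming units with $c = 1$ and the standard formula for the 3-velocity of one particle in the rest frame of another; the helper names (`gamma`, `rapidity`, `relative_velocity`) and the sample velocities are illustrative choices, not taken from the text.

```python
import math

C = 1.0  # assumption: units in which c = 1


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gamma(v):
    """Lorentz factor of the 3-velocity v."""
    return 1.0 / math.sqrt(1.0 - dot(v, v) / C**2)


def rapidity(v):
    """Rapidity alpha, defined by |v| = c tanh(alpha)."""
    return math.atanh(math.sqrt(dot(v, v)) / C)


def relative_velocity(v2, v1):
    """3-velocity of particle 2 in the rest frame of particle 1 (parallel axes)."""
    g1 = gamma(v1)
    d = dot(v1, v2) / C**2
    k = g1 / (g1 + 1.0) * d - 1.0  # coefficient multiplying v1
    return tuple((b / g1 + k * a) / (1.0 - d) for a, b in zip(v1, v2))


# Hypothetical velocities of Sigma_1 and Sigma_2 in Sigma_0.
v10 = (0.6, 0.0, 0.0)
v20 = (0.3, 0.5, 0.0)
v21 = relative_velocity(v20, v10)

a10, a20, a21 = rapidity(v10), rapidity(v20), rapidity(v21)
cos_A12 = dot(v10, v20) / math.sqrt(dot(v10, v10) * dot(v20, v20))

lhs = math.cosh(a21)
rhs = (math.cosh(a10) * math.cosh(a20)
       - math.sinh(a10) * math.sinh(a20) * cos_A12)
print(lhs, rhs)  # the two sides of the cosine law should agree to rounding error
```

The sides entering the law are the rapidities, so the check works with `atanh` of the speeds rather than with the speeds themselves.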
In order to make sure that we have a triangle in hyperbolic Euclidian geometry
we must prove that the above result holds for all three vertices of the triangle. To do
that we go to the definition of relative velocity and write:

$$v_{21}^i = v_{20}^i - v_{10}^i \;\Rightarrow\; v_{20}^i = v_{21}^i + v_{10}^i = v_{10}^i - v_{12}^i.$$


8 As we shall show in Sect. 15.4.3, when we study the covariant form of the Lorentz transformation, 
the 3-velocity space is a three dimensional Riemannian manifold of constant negative curvature 
whose metric is Lorentz covariant. Such spaces are known as Lobachevsky spaces. 




But $\gamma_{21} = \gamma_{12}$, i.e. $\cosh\alpha_{12} = \cosh\alpha_{21}$. It is an easy exercise to show that:

$$\cosh\alpha_{20} = \cosh\alpha_{12}\cosh\alpha_{10} - \sinh\alpha_{10}\sinh\alpha_{12}\cos A_{02} \tag{6.45}$$

where $A_{02}$ is the angle between the velocities $\mathbf{v}_{12}$ and $\mathbf{v}_{10}$ in $\Sigma_1$. Similarly, we prove
the corresponding relation for the quantity $\cosh\alpha_{10}$. These three relations establish
that the 3-velocity triangle is a hyperbolic triangle with sides $\alpha_{10}, \alpha_{20}, \alpha_{21}$ and
corresponding angles $A_{12}, A_{01}, A_{02}$ in a three dimensional hyperbolic Euclidian
space. Note that the sides of this triangle are the rapidities and not the speeds of the
3-velocities.

A crucial question is whether the relative velocity is compatible with the Lorentz
transformation. This is not obvious, although it is suggested by the result we have
just derived. To prove that this is the case, we note that the 3-velocity of $\Sigma_0$ in $\Sigma_1$
is $-\mathbf{v}_{10}$ and in $\Sigma_2$ is $-\mathbf{v}_{20}$, whereas the 3-velocity of $\Sigma_2$ wrt $\Sigma_1$ is $\mathbf{v}_{21}$. Therefore
the definition of the relative velocity we gave will be compatible with the Lorentz
transformation if the components of the four-velocity of $\Sigma_0$ in $\Sigma_1$ and $\Sigma_2$ are related
by a proper Lorentz transformation with velocity $\mathbf{v}_{21}$.

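We choose the coordinate frame in $\Sigma_0$ so that the $x$-axis is along the direction of
$\mathbf{v}_{12} = -\mathbf{v}_{21}$. According to the composition formula (6.37) the three 3-velocities
$\mathbf{v}_{10}, \mathbf{v}_{20}, \mathbf{v}_{21}$ are coplanar (in $\Sigma_0$). We choose the $y$-axis to be normal to the $x$-axis in the
plane of the 3-velocities. We recall that the angle between $\mathbf{v}_{21}$ and $\mathbf{v}_{20}$ is $A_{01}$, hence
the angle between $\mathbf{v}_{12}$ and $-\mathbf{v}_{20}$ (the 3-velocity of $\Sigma_0$ in $\Sigma_2$) is also $A_{01}$; similarly, the angle
between $\mathbf{v}_{21}$ and $\mathbf{v}_{10}$ is $\pi - A_{02}$, hence so is the angle between $\mathbf{v}_{12}$ and $-\mathbf{v}_{10}$ (the
3-velocity of $\Sigma_0$ in $\Sigma_1$). We write for the 4-velocities of $\Sigma_0$ in $\Sigma_1$ and $\Sigma_2$:

$$
v_{01}^i = \begin{pmatrix} c\cosh\alpha_{10} \\ -c\sinh\alpha_{10}\cos A_{02} \\ c\sinh\alpha_{10}\sin A_{02} \\ 0 \end{pmatrix}_{\Sigma_1},
\qquad
v_{02}^i = \begin{pmatrix} c\cosh\alpha_{20} \\ c\sinh\alpha_{20}\cos A_{01} \\ c\sinh\alpha_{20}\sin A_{01} \\ 0 \end{pmatrix}_{\Sigma_2}.
$$

Since $\Sigma_2$ moves in $\Sigma_1$ along the negative $x$-direction, the matrix representing the
Lorentz transformation in the coordinate system we have chosen is:

$$
L(\mathbf{v}_{21}) = \begin{pmatrix}
\cosh\alpha_{21} & \sinh\alpha_{21} & 0 & 0 \\
\sinh\alpha_{21} & \cosh\alpha_{21} & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
$$

therefore we must have:

$$[v_{02}^i]_{\Sigma_2} = [L(\mathbf{v}_{21})]\,[v_{01}^i]_{\Sigma_1}$$

or:

$$
\begin{aligned}
\cosh\alpha_{20} &= \cosh\alpha_{10}\cosh\alpha_{21} - \sinh\alpha_{10}\sinh\alpha_{21}\cos A_{02} \\
\sinh\alpha_{20}\cos A_{01} &= \sinh\alpha_{21}\cosh\alpha_{10} - \cosh\alpha_{21}\sinh\alpha_{10}\cos A_{02} \\
\sinh\alpha_{20}\sin A_{01} &= \sinh\alpha_{10}\sin A_{02}
\end{aligned}
\tag{6.46}
$$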


The first equation is identical with (6.45) while the other two are identities
of Euclidian spherical trigonometry if we use imaginary sides $i\alpha_{10}, i\alpha_{20}, i\alpha_{21}$. For
example, the last relation is the “sine law”. Equivalently, they are the relations
of hyperbolic trigonometry, that is the trigonometry of the hyperbolic sphere. We
conclude that the definition of relative motion is compatible with the Lorentz
transformation.
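The statement that the velocity triangle is hyperbolic at every vertex can be probed numerically. The sketch below, assuming $c = 1$ and the standard relative-velocity formula, measures each angle in the frame of the corresponding vertex and then tests the cosine law (6.45) and the sine law. All function names and the sample velocities are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def gamma(v):
    return 1.0 / math.sqrt(1.0 - dot(v, v))  # c = 1


def rapidity(v):
    return math.atanh(norm(v))


def relative_velocity(v2, v1):
    """3-velocity of particle 2 in the rest frame of particle 1 (parallel axes)."""
    g1 = gamma(v1)
    d = dot(v1, v2)
    k = g1 / (g1 + 1.0) * d - 1.0
    return tuple((b / g1 + k * a) / (1.0 - d) for a, b in zip(v1, v2))


def angle(a, b):
    return math.acos(dot(a, b) / (norm(a) * norm(b)))


zero = (0.0, 0.0, 0.0)          # Sigma_0 is at rest in its own frame
v10 = (0.6, 0.0, 0.0)           # hypothetical velocity of Sigma_1 in Sigma_0
v20 = (0.3, 0.5, 0.0)           # hypothetical velocity of Sigma_2 in Sigma_0

# Vertex 1: velocities of Sigma_0 and Sigma_2 as seen in Sigma_1.
v01, v21 = relative_velocity(zero, v10), relative_velocity(v20, v10)
# Vertex 2: velocities of Sigma_0 and Sigma_1 as seen in Sigma_2.
v02, v12 = relative_velocity(zero, v20), relative_velocity(v10, v20)

A02 = angle(v01, v21)           # angle at vertex 1
A01 = angle(v02, v12)           # angle at vertex 2
a10, a20, a21 = rapidity(v10), rapidity(v20), rapidity(v21)

cosine_law_gap = math.cosh(a20) - (math.cosh(a21) * math.cosh(a10)
                                   - math.sinh(a10) * math.sinh(a21) * math.cos(A02))
sine_law_gap = math.sinh(a20) * math.sin(A01) - math.sinh(a10) * math.sin(A02)
print(cosine_law_gap, sine_law_gap)  # both should vanish to rounding error
```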

In plane Euclidian trigonometry the angles of a triangle add to $\pi$. In spherical and
hyperbolic trigonometry this does not hold and one defines the spherical/hyperbolic
defect $\epsilon$ by the formula

$$\epsilon = \pi - (A_{01} + A_{02} + A_{12}). \tag{6.47}$$

In spherical trigonometry $\epsilon$ is negative and its absolute value equals the area
of the triangle (on the unit sphere). In hyperbolic trigonometry $\epsilon$ is positive. An
expression of $\epsilon$ in terms of the three sides and an angle is the following$^{9}$:

$$\sin\frac{\epsilon}{2} = \left[\frac{(\cosh\alpha_{20} - 1)(\cosh\alpha_{12} - 1)}{2(\cosh\alpha_{10} + 1)}\right]^{1/2}\sin A_{01} \tag{6.48}$$

and cyclically for each vertex. It is important to see that the velocity triangle contains
all the information concerning the relativistic relative 3-velocity, that is, it gives both
the speeds and the relative directions of the velocities.
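As an illustration, the defect of a hyperbolic triangle can be computed directly from its sides. The sketch below takes three hypothetical rapidities, obtains the angles from the hyperbolic cosine law, and compares the defect with the closed expression, assuming (6.48) has the form $\sin(\epsilon/2) = \sinh(\alpha_{20}/2)\sinh(\alpha_{12}/2)\sin A_{01}/\cosh(\alpha_{10}/2)$ written in terms of hyperbolic cosines.

```python
import math


def angle_opposite(side, s1, s2):
    """Angle opposite `side` in a hyperbolic triangle whose other two sides are s1, s2."""
    cos_a = ((math.cosh(s1) * math.cosh(s2) - math.cosh(side))
             / (math.sinh(s1) * math.sinh(s2)))
    return math.acos(cos_a)


# Hypothetical rapidities (sides of the velocity triangle).
a10, a20, a21 = 0.8, 1.1, 0.9

A12 = angle_opposite(a21, a10, a20)   # angle at vertex 0
A01 = angle_opposite(a10, a20, a21)   # angle at vertex 2
A02 = angle_opposite(a20, a10, a21)   # angle at vertex 1

defect = math.pi - (A01 + A02 + A12)

closed_form = math.sqrt((math.cosh(a20) - 1.0) * (math.cosh(a21) - 1.0)
                        / (2.0 * (math.cosh(a10) + 1.0))) * math.sin(A01)
print(defect, math.sin(defect / 2.0), closed_form)
```

The defect comes out positive, as expected for hyperbolic geometry, and the last two printed numbers should coincide.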


6.6 Thomas Precession 


The Thomas precession is a purely relativistic kinematic phenomenon, which is due
to the properties of the Lorentz transformation, and more specifically, to the
non-covariant character of Euclidian parallelism. The first to note this property of the
Lorentz transformation was L. Thomas$^{10}$, who applied it in the study of the emission
spectra of certain atoms. According to the simple model of the atom at that time,
the electrons were considered as negatively charged spheres, which were rotating
around the axis of their spin like gyroscopes and also around the positively charged


9 See for example B.P. Peirce, A Short Table of Integrals, Ginn, Boston 1929, formulae 631 and 632,
and for a theoretical treatment A. Ungar, Foundations of Physics (1998) 28, 1283-1321.


10 L. H. Thomas (1927), Philosophical Magazine 3, 1.




nucleus.$^{11}$ According to the Newtonian theory, the two rotations do not interact and
after a complete rotation about the nucleus the axis of spin returns to its original
position. However, the non-covariant character of Newtonian parallelism under the
action of the Lorentz transformation leads to an interaction of these two rotations, with
the effect that after a complete revolution around the nucleus the axis of spin makes
an angle with its initial direction. This type of rotational motion is called precession.
The special case we consider here has been called Thomas precession.

In order to study the Thomas precession we consider three LCF $\Sigma_1, \Sigma_2, \Sigma_3$ such
that $\Sigma_1, \Sigma_2$ and $\Sigma_2, \Sigma_3$ have parallel axes and non-collinear relative velocities. Let
the Lorentz transformations that relate the pairs $\Sigma_1, \Sigma_2$ and $\Sigma_2, \Sigma_3$ be $L_{12}$ and $L_{23}$
respectively. The composite Lorentz transformation $L_{13} = L_{23}L_{12}$ relating $\Sigma_1, \Sigma_3$ is also
a Lorentz transformation, because Lorentz transformations form a group. However,
as we shall show, this transformation is of a different type than $L_{12}, L_{23}$ and
corresponds to a Lorentz transformation in which the space axes of the transformed
$\Sigma_1$ are not parallel (in the Euclidian sense!) with the space axes of the original
$\Sigma_1$. This rotation of the space axes is a purely kinematic relativistic phenomenon,
without a Newtonian analogue.

We have seen a first approach to the Thomas phenomenon in Example 6.3.1, where
we studied the composition of boosts along different directions. However, as we
shall see below, the real power of the Thomas phenomenon is in accelerated
motions.

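Consider two LCF $\Sigma_1, \Sigma_2$ with parallel space axes and let $\mathbf{u}$ be the velocity of
$\Sigma_2$ wrt $\Sigma_1$. The proper Lorentz transformation $L_{12}(\mathbf{u})$ which relates $\Sigma_1, \Sigma_2$ is given
by the general relations (1.51) and (1.52), or, equivalently in the form of a matrix,
from (1.47):

$$
L^{i}_{(12)j}(\mathbf{u}) = \begin{pmatrix}
\gamma & -\gamma\beta_\nu \\
-\gamma\beta^\mu & \delta^\mu_\nu + \dfrac{\gamma - 1}{\beta^2}\beta^\mu\beta_\nu
\end{pmatrix}
\tag{6.49}
$$

where $c\beta^\mu = \mathbf{u} = (u^1, u^2, u^3)^t_{\Sigma_1}$ and $c\beta_\mu = (u_1, u_2, u_3)_{\Sigma_1}$. The inverse Lorentz
transformation is $L^{i}_{(21)j}(-\mathbf{u})$.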

We consider a third LCF $\Sigma_3$, which has space axes parallel to those of $\Sigma_2$
and velocity $\mathbf{v}$. The Lorentz transformation $L^{i}_{(23)j}(\mathbf{v})$ relating $\Sigma_2$ and $\Sigma_3$ is again
of the general form (6.49) with $\mathbf{u}$ replaced by $\mathbf{v}$. Because the proper Lorentz
transformations form a group, there must exist a third Lorentz transformation
$L^{i}_{(13)j}(\mathbf{w})$ relating $\Sigma_1, \Sigma_3$ with velocity $\mathbf{w}$, say, defined by the relation:

$$L^{i}_{(13)j}(\mathbf{w}) = L^{i}_{(23)k}(\mathbf{v})\,L^{k}_{(12)j}(\mathbf{u}). \tag{6.50}$$
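The product of two non-collinear boosts can be inspected numerically. The sketch below builds the boost matrix in the standard form assumed for (6.49) (with $c = 1$), multiplies two perpendicular boosts, reads off the velocity $\mathbf{w}$ from the first row, and isolates what is left after the boost with $\mathbf{w}$ is removed. The function names and the chosen velocities are illustrative assumptions.

```python
import math


def boost(beta):
    """4x4 boost matrix for the velocity beta = u/c (a 3-tuple), parallel axes."""
    b2 = sum(x * x for x in beta)
    g = 1.0 / math.sqrt(1.0 - b2)
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = g
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = -g * beta[i]
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            extra = (g - 1.0) * beta[i] * beta[j] / b2 if b2 > 0.0 else 0.0
            L[i + 1][j + 1] = delta + extra
    return L


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


u = (0.6, 0.0, 0.0)   # hypothetical velocity of Sigma_2 in Sigma_1 (units of c)
v = (0.0, 0.5, 0.0)   # hypothetical velocity of Sigma_3 in Sigma_2 (units of c)

L13 = matmul(boost(v), boost(u))            # the product on the rhs of (6.50)

# A boost matrix with parallel axes is symmetric; the product is not.
asymmetry = L13[2][1] - L13[1][2]

# Velocity w of Sigma_3 in Sigma_1, from the first row: l3 = gamma_w (l1 - w.r1).
w = tuple(-L13[0][j + 1] / L13[0][0] for j in range(3))

# Undo the boost with velocity w: the remainder acts on the space axes only.
R = matmul(L13, boost(tuple(-x for x in w)))
theta = math.atan2(R[2][1], R[1][1])        # rotation angle in the x-y plane
print(w, theta)
```

The remainder `R` leaves the time component untouched and rotates the space axes, which is the rotation discussed in the rest of this section.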


11 More on the nature of spin we shall discuss later in the chapter on the angular momentum in
Special Relativity.




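In order to compute the velocity $\mathbf{w}$, we compute the product transformation
$L^{i}_{(23)k}(\mathbf{v})\,L^{k}_{(12)j}(\mathbf{u})$ and then we identify it with a proper Lorentz transformation of
the general form (6.49).

We consider an arbitrary spacetime point, whose position four-vector in the LCF
$\Sigma_1, \Sigma_2, \Sigma_3$ has components:

$$
\begin{pmatrix} l_1 \\ \mathbf{r}_1 \end{pmatrix}_{\Sigma_1}, \quad
\begin{pmatrix} l_2 \\ \mathbf{r}_2 \end{pmatrix}_{\Sigma_2}, \quad
\begin{pmatrix} l_3 \\ \mathbf{r}_3 \end{pmatrix}_{\Sigma_3}.
\tag{6.51}
$$

Then relations (1.51) and (1.52) give:

$L_{12}(\mathbf{u})$:

$$l_2 = \gamma_u\left(l_1 - \frac{\mathbf{u}\cdot\mathbf{r}_1}{c}\right) \tag{6.52}$$

$$\mathbf{r}_2 = \mathbf{r}_1 - \gamma_u\frac{\mathbf{u}}{c}\,l_1 + (\gamma_u - 1)\frac{\mathbf{u}\cdot\mathbf{r}_1}{u^2}\,\mathbf{u} \tag{6.53}$$

$L_{23}(\mathbf{v})$:

$$l_3 = \gamma_v\left(l_2 - \frac{\mathbf{v}\cdot\mathbf{r}_2}{c}\right) \tag{6.54}$$

$$\mathbf{r}_3 = \mathbf{r}_2 - \gamma_v\frac{\mathbf{v}}{c}\,l_2 + (\gamma_v - 1)\frac{\mathbf{v}\cdot\mathbf{r}_2}{v^2}\,\mathbf{v}. \tag{6.55}$$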


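Using (6.52) and (6.53) we have from (6.54):

$$
\begin{aligned}
l_3 &= \gamma_v\left[\gamma_u\left(l_1 - \frac{\mathbf{u}\cdot\mathbf{r}_1}{c}\right) - \frac{\mathbf{v}}{c}\cdot\left(\mathbf{r}_1 - \gamma_u\frac{\mathbf{u}}{c}\,l_1 + (\gamma_u - 1)\frac{\mathbf{u}\cdot\mathbf{r}_1}{u^2}\,\mathbf{u}\right)\right] \\
&= \gamma_u\gamma_v\left(1 + \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right)l_1 - \frac{\gamma_v}{c}\,\mathbf{r}_1\cdot\left[\mathbf{v} + \gamma_u\mathbf{u} + (\gamma_u - 1)\frac{\mathbf{u}\cdot\mathbf{v}}{u^2}\,\mathbf{u}\right]
\end{aligned}
\tag{6.56}
$$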


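Consider now the LCF $\Sigma_1, \Sigma_3$ whose velocities wrt the LCF $\Sigma_2$ are $-\mathbf{u}$ and $\mathbf{v}$
respectively. Then according to (6.36) the relative velocity of $\Sigma_3$ relative to $\Sigma_1$ is

$$\mathbf{v}_{13} = \frac{1}{\gamma_u\left(1 + \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right)}\left[\mathbf{v} + \gamma_u\mathbf{u} + (\gamma_u - 1)\frac{\mathbf{u}\cdot\mathbf{v}}{u^2}\,\mathbf{u}\right].$$

If we set $\mathbf{w} = \mathbf{v}_{13}$ then (6.56) becomes:

$$l_3 = \gamma_u\gamma_v\left(1 + \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right)\left[l_1 - \frac{\mathbf{w}\cdot\mathbf{r}_1}{c}\right]. \tag{6.57}$$

But we know that (see (6.7)):

$$\gamma_w = \gamma_u\gamma_v Q$$

where $Q = \left(1 + \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right)$. Therefore:

$$l_3 = \gamma_w\left(l_1 - \frac{\mathbf{w}\cdot\mathbf{r}_1}{c}\right). \tag{6.58}$$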

We conclude that, as far as the zeroth component of the position four-vector is
concerned, the composite Lorentz transformation $L_{13}(\mathbf{w})$, where $\mathbf{w}$ is the relative
velocity of $\Sigma_3$ relative to $\Sigma_1$, behaves as a proper Lorentz transformation with
parallel axes and velocity $\mathbf{w}$. This means that the differences between $L_{13}(\mathbf{w})$ and
the product transformation $L^{i}_{(23)k}(\mathbf{v})\,L^{k}_{(12)j}(\mathbf{u})$ concern the spatial part $\mathbf{r}_3$ of the position
four-vector and more specifically either the length or the direction of $\mathbf{r}_3$, or both. We
treat each case separately.

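Let $\mathbf{r}_3(\mathbf{w})$ be the spatial part of the position four-vector after the action of $L^{i}_{(13)j}(\mathbf{w})$
(on $\begin{pmatrix} l_1 \\ \mathbf{r}_1 \end{pmatrix}_{\Sigma_1}$) and let $l_3(\mathbf{w})$ be the temporal component. The invariance of the
Lorentz length of the four-vector implies:

$$l_3^2(\mathbf{w}) - \mathbf{r}_3^2(\mathbf{w}) = l_3^2 - \mathbf{r}_3^2.$$

From (6.58) we have $l_3(\mathbf{w}) = l_3$, hence:

$$\mathbf{r}_3^2(\mathbf{w}) = \mathbf{r}_3^2. \tag{6.59}$$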
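The equality of the temporal components and of the spatial lengths can be illustrated with numbers. The sketch below, assuming $c = 1$, applies the boost relations (6.52) and (6.53) twice to an arbitrary event and compares the result with a single boost whose velocity $\mathbf{w}$ is obtained from the composition rule quoted in the text. The event and the velocities are hypothetical values.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gamma(v):
    return 1.0 / math.sqrt(1.0 - dot(v, v))  # c = 1


def boost(l, r, u):
    """Map the components (l, r) to the frame moving with velocity u (parallel axes)."""
    g = gamma(u)
    ur = dot(u, r)
    l_new = g * (l - ur)
    k = (g - 1.0) * ur / dot(u, u) - g * l   # coefficient multiplying u
    return l_new, tuple(ri + k * ui for ri, ui in zip(r, u))


def compose(u, v):
    """Velocity of Sigma_3 in Sigma_1 when Sigma_2 has u in Sigma_1 and Sigma_3 has v in Sigma_2."""
    gu = gamma(u)
    Q = 1.0 + dot(u, v)
    k = gu + (gu - 1.0) * dot(u, v) / dot(u, u)
    return tuple((vi + k * ui) / (gu * Q) for ui, vi in zip(u, v))


u = (0.6, 0.0, 0.0)
v = (0.1, 0.5, 0.2)
l1, r1 = 2.0, (0.3, -0.7, 1.1)           # an arbitrary event in Sigma_1

l2, r2 = boost(l1, r1, u)
l3, r3 = boost(l2, r2, v)                # product transformation

w = compose(u, v)
l3w, r3w = boost(l1, r1, w)              # single boost with velocity w

print(l3, l3w)                           # equal temporal components
print(dot(r3, r3), dot(r3w, r3w))        # equal lengths
print(r3, r3w)                           # different directions
```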
We conclude that the two 3-vectors $\mathbf{r}_3(\mathbf{w})$, $\mathbf{r}_3$ differ only in their direction (a
Euclidian rotation), therefore they must be related with an orthogonal Euclidian
transformation $A$, say:

$$\mathbf{r}_3(\mathbf{w}) = A\,\mathbf{r}_3 \tag{6.60}$$

where $A^tA = I_3$. This Euclidian transformation is the essence of the Thomas
phenomenon. In order to compute the matrix $A$, we consider the infinitesimal
form of the Euclidian transformation. We write $\mathbf{v} = \delta\mathbf{u}$ and have in second order
approximation $O(\delta u^2)$:

$$\mathbf{r}_3(\mathbf{w}) = \mathbf{r}_3 + d\boldsymbol{\Omega}\times\mathbf{r}_3(\mathbf{w}) + O(\delta u^2)$$

or

$$\mathbf{r}_3(\mathbf{w}) - d\boldsymbol{\Omega}\times\mathbf{r}_3(\mathbf{w}) = \mathbf{r}_3 + O(\delta u^2). \tag{6.61}$$

The 3-vector $d\boldsymbol{\Omega}$ of $\Sigma_3$ represents the rotation angle $d\theta(\mathbf{u})$, which corresponds
to the Euclidian matrix $A$. Because the matrix $A$ is the same for all four-vectors in
Minkowski space, we take the four-vector $x^i$ to be timelike and furthermore we take
$\Sigma_1$ to be its proper frame. Then $\mathbf{r}_1 = 0$ and $l_1^2 = -x^ix_i$. From (6.52) and (6.53) we
find in $\Sigma_2$ for this four-vector:

$$l_2 = \gamma_u l_1, \qquad \mathbf{r}_2 = -\gamma_u\frac{\mathbf{u}}{c}\,l_1. \tag{6.62}$$
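The 3-vector $d\boldsymbol{\Omega}$ is an axial 3-vector,$^{12}$ which depends on the 3-vectors $\mathbf{u}, \delta\mathbf{u}$ and
vanishes when $\mathbf{u}, \delta\mathbf{u}$ are parallel. Indeed, we observe that in case $\mathbf{u}\parallel\mathbf{v}$ the composite
velocity $\mathbf{w}\parallel\mathbf{u}$ and the spatial axes of $\Sigma_1, \Sigma_2, \Sigma_3$ are parallel. This means that $d\boldsymbol{\Omega}$
has the general form:

$$d\boldsymbol{\Omega} = \alpha\,\mathbf{u}\times\delta\mathbf{u} \tag{6.63}$$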


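where $\alpha$ is a function, which has to be determined. In this general form,
equation (6.55) combined with (6.62) and (6.52) gives:

$$\mathbf{r}_3 = \mathbf{r}_2 - \gamma_{\delta u}\frac{\delta\mathbf{u}}{c}\,l_2 + O(\delta u^2) = -\frac{\gamma_u}{c}(\mathbf{u} + \delta\mathbf{u})\,l_1 + O(\delta u^2) \tag{6.64}$$

where in our approximation $\gamma_{\delta u} = 1$.
Concerning $\mathbf{r}_3(\mathbf{w})$, we have from (1.52) for velocity $\mathbf{w}$ and $\mathbf{r}_1 = 0$ ($Q = 1 + \frac{\mathbf{u}\cdot\delta\mathbf{u}}{c^2}$):

$$\mathbf{r}_3(\mathbf{w}) = -\gamma_w\frac{\mathbf{w}}{c}\,l_1.$$

This equation gives, if we set $\mathbf{v} = \delta\mathbf{u}$ and apply the composition of the $\gamma$'s, i.e.
$\gamma_w = \gamma_u\gamma_{\delta u}Q$:

$$
\begin{aligned}
\mathbf{r}_3(\mathbf{w}) &= -\gamma_u\gamma_{\delta u}Q\,\frac{1}{\gamma_uQc}\left[\delta\mathbf{u} + \gamma_u\mathbf{u} + (\gamma_u - 1)\frac{\mathbf{u}\cdot\delta\mathbf{u}}{u^2}\,\mathbf{u}\right]l_1 \\
&= -\frac{1}{c}\left[\delta\mathbf{u} + \gamma_u\mathbf{u} + (\gamma_u - 1)\frac{\mathbf{u}\cdot\delta\mathbf{u}}{u^2}\,\mathbf{u}\right]l_1.
\end{aligned}
\tag{6.65}
$$

The term$^{13}$:

$$
\begin{aligned}
d\boldsymbol{\Omega}\times\mathbf{r}_3(\mathbf{w}) &= \alpha(\mathbf{u}\times\delta\mathbf{u})\times\left(-\frac{1}{c}\right)\left[\delta\mathbf{u} + \gamma_u\mathbf{u} + (\gamma_u - 1)\frac{\mathbf{u}\cdot\delta\mathbf{u}}{u^2}\,\mathbf{u}\right]l_1 \\
&= -\frac{\alpha}{c}\left[\gamma_uu^2\,\delta\mathbf{u} - \gamma_u(\mathbf{u}\cdot\delta\mathbf{u})\,\mathbf{u} + O(\delta u^2)\right]l_1.
\end{aligned}
$$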


12 These vectors are 1-forms but this is not crucial for our considerations here.
13 Apply the identity $\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = (\mathbf{A}\cdot\mathbf{C})\mathbf{B} - (\mathbf{A}\cdot\mathbf{B})\mathbf{C}$.




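Replacing in (6.61) we find:

$$
\begin{aligned}
&(-1 + \alpha\gamma_uu^2 + \gamma_u)\,\delta\mathbf{u} - \left[(\gamma_u - 1)\frac{\mathbf{u}\cdot\delta\mathbf{u}}{u^2} + \alpha\gamma_u(\mathbf{u}\cdot\delta\mathbf{u})\right]\mathbf{u} = 0 \;\Rightarrow \\
&(-1 + \alpha\gamma_uu^2 + \gamma_u)\left(\delta\mathbf{u} - \frac{\mathbf{u}\cdot\delta\mathbf{u}}{u^2}\,\mathbf{u}\right) = 0.
\end{aligned}
$$

This relation holds for all $\mathbf{u}, \delta\mathbf{u}$, therefore:

$$\alpha = -\frac{\gamma_u - 1}{\gamma_uu^2}. \tag{6.66}$$

Replacing $\alpha$ in (6.63) we find:

$$d\boldsymbol{\Omega} = -\frac{\gamma_u - 1}{\gamma_uu^2}\,\mathbf{u}\times\delta\mathbf{u}. \tag{6.67}$$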


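This formula gives $d\boldsymbol{\Omega}$ in terms of $\delta\mathbf{u}$, which is velocity relative to $\Sigma_2$. As it turns
out, in applications we need the velocity $\delta\mathbf{v}_1$ relative to $\Sigma_1$. In order to calculate $\delta\mathbf{v}_1$,
we apply the general relation (6.12) for the following velocities:

$$
\begin{aligned}
\mathbf{u} &\to -\mathbf{u} & (\Sigma_2 \to \Sigma_1) \\
\mathbf{v} &\to \delta\mathbf{u} & (\Sigma_2 \to \Sigma_3) \\
\mathbf{v}' &\to \delta\mathbf{v}_1 & (\Sigma_1 \to \Sigma_3)
\end{aligned}
$$

and have:

$$\delta\mathbf{v}_1 = \frac{1}{\gamma_uQ}\left[\delta\mathbf{u} + \gamma_u\mathbf{u} + (\gamma_u - 1)\frac{\mathbf{u}\cdot\delta\mathbf{u}}{u^2}\,\mathbf{u}\right] \;\Rightarrow\; \mathbf{u}\times\delta\mathbf{v}_1 = \frac{1}{\gamma_uQ}\,\mathbf{u}\times\delta\mathbf{u}.$$

Then equation (6.67) becomes:

$$d\boldsymbol{\Omega} = -\frac{\gamma_u - 1}{u^2}\,Q\,\mathbf{u}\times\delta\mathbf{v}_1. \tag{6.68}$$

In case the velocities $u, \delta u \ll c$, then $Q \approx 1 + O(\delta u^2)$ and $d\boldsymbol{\Omega}$ reduces to:

$$d\boldsymbol{\Omega} = -\frac{\gamma_u - 1}{u^2}\,(\mathbf{u}\times\delta\mathbf{v}_1). \tag{6.69}$$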




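The rotation $d\boldsymbol{\Omega}$ in $\Sigma_1$ takes place with angular velocity $\boldsymbol{\omega}_T$, which is defined by
the relation:

$$\boldsymbol{\omega}_T = \frac{d\boldsymbol{\Omega}}{dt_1} = -\frac{\gamma_u - 1}{u^2}\,\mathbf{u}\times\mathbf{a} \tag{6.70}$$

where $\mathbf{a} = \frac{\delta\mathbf{v}_1}{dt_1}$ is the 3-acceleration of the origin of $\Sigma_3$ the moment the velocity of
$\Sigma_3$ wrt $\Sigma_2$ is $\delta\mathbf{u}$.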
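The coefficient of the infinitesimal rotation can be compared with an explicit composition of boosts. The sketch below works in the $(t, x, y)$ plane with $c = 1$, composes a boost of speed $u$ along $x$ with a small boost $\delta u$ along $y$, strips off the boost with the composite velocity, and compares the leftover rotation angle with $-\frac{\gamma_u - 1}{\gamma_uu^2}\,(\mathbf{u}\times\delta\mathbf{u})$. With the convention of (6.61), the product transformation differs from the pure boost by a rotation through $-d\boldsymbol{\Omega}$, so the two numbers are expected to have opposite signs. Names and numerical values are hypothetical.

```python
import math


def boost2d(bx, by):
    """Boost matrix in (t, x, y) for the velocity (bx, by) in units of c, parallel axes."""
    b2 = bx * bx + by * by
    g = 1.0 / math.sqrt(1.0 - b2)
    k = (g - 1.0) / b2
    return [[g, -g * bx, -g * by],
            [-g * bx, 1.0 + k * bx * bx, k * bx * by],
            [-g * by, k * bx * by, 1.0 + k * by * by]]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


u = 0.6         # speed of Sigma_2 in Sigma_1 along x
du = 1.0e-4     # small speed of Sigma_3 in Sigma_2 along y

L = matmul(boost2d(0.0, du), boost2d(u, 0.0))     # product transformation
wx, wy = -L[0][1] / L[0][0], -L[0][2] / L[0][0]   # composite velocity w
R = matmul(L, boost2d(-wx, -wy))                  # what remains is a rotation
theta = math.atan2(R[2][1], R[1][1])

g_u = 1.0 / math.sqrt(1.0 - u * u)
alpha = -(g_u - 1.0) / (g_u * u * u)              # coefficient of u x du
dOmega_z = alpha * (u * du)                       # z-component of alpha * (u x du)
print(theta, dOmega_z)
```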

Although the Thomas precession has a direct geometric explanation, in general 
it is considered as abstruse. This is due to the fact that it is a purely relativistic 
phenomenon incompatible with the Euclidian concept of parallelism, which is a 
strongly empirical concept. 

In order to give a physical meaning to the Thomas precession, we consider the
simple atom model (by now obsolete!), which is a nucleus with electrons rotating
around it with constant angular velocity. We assume that the LCF $\Sigma_1$ is the proper
frame of the nucleus and let $\tau$ be the proper time of a rotating electron. Then, at
each point along the trajectory of the electron, the position 3-vector $\mathbf{r}_1(\tau)$ and the
3-velocity are determined by the proper time $\tau$. The relation between the proper time
$\tau$ and the time $t_1$ in $\Sigma_1$ (the nucleus) is:

$$t_1 = \gamma(u)\,\tau.$$

Consider the neighboring points along the trajectory of the electron with position
vectors $\mathbf{r}_1(\tau)$ and $\mathbf{r}_1(\tau + \delta\tau)$. At the point $\mathbf{r}_1(\tau)$ we consider the LCF $\Sigma_2$ with space
axes parallel to the axes of $\Sigma_1$ and at the point $\mathbf{r}_1(\tau + \delta\tau)$ we consider the LCF
$\Sigma_3$ with space axes parallel to the axes of $\Sigma_2$. Due to the Thomas phenomenon, the
space axes of $\Sigma_3$ will appear to rotate in $\Sigma_1$ with angular velocity $\boldsymbol{\omega}_T$. The angle of
rotation in time $dt_1$ is $\omega_Tdt_1$. Because the velocity is normal to the acceleration $\mathbf{a}$,
and $a = \frac{u^2}{\rho}$, where $\rho$ is the radius of the atom in $\Sigma_1$ (proper frame of the nucleus),
we have:

$$\boldsymbol{\omega}_T = -\frac{\gamma_u - 1}{u^2}\,u\,\frac{u^2}{\rho}\,\mathbf{n} = -(\gamma_u - 1)\frac{u}{\rho}\,\mathbf{n} \tag{6.71}$$

where $\mathbf{n}$ is the unit normal to the plane of $\mathbf{u}, \mathbf{a}$ (in which lies the orbit of the electron).
Obviously, due to the small value of the quantity $\gamma_u - 1$ for usual speeds, the angular
velocity $\omega_T$ of precession is very small. If we set $\omega = \frac{u}{\rho}$, where $\omega$ is the angular
speed of the electron around the nucleus, we find that in $\Sigma_1$:

$$\boldsymbol{\omega}_T = -(\gamma_u - 1)\,\omega\,\mathbf{n}. \tag{6.72}$$

Taking the coordinates so that the $z$-axis is along the direction of $\mathbf{n}$, we have that the
$x - y$ plane is rotating counter-clockwise with angular velocity $\omega_T = (\gamma_u - 1)\,\omega$.
For every complete rotation of the electron, the precession angle is $2\pi(\gamma_u - 1)$ rad.
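To get a feeling for the size of the effect, the sketch below evaluates the precession angle per revolution, $2\pi(\gamma_u - 1)$, for an assumed orbital speed $u/c = 1/137$. This speed is a hypothetical choice of the order usually quoted for the simple atom model, and the ratio of the precession period to the orbital period is taken to be $1/(\gamma_u - 1)$, as follows from $\omega_T = (\gamma_u - 1)\,\omega$.

```python
import math

beta = 1.0 / 137.0                       # assumed orbital speed u/c (hypothetical value)
gamma_u = 1.0 / math.sqrt(1.0 - beta**2)

angle_per_turn = 2.0 * math.pi * (gamma_u - 1.0)   # precession angle per revolution, in rad
period_ratio = 1.0 / (gamma_u - 1.0)               # precession period / orbital period

print(angle_per_turn, period_ratio)
```

For such speeds $\gamma_u - 1 \approx u^2/2c^2$, so the spin axis needs tens of thousands of revolutions to complete one precession cycle.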


Exercise 6.6.1 Assuming that the period of rotation of the electron (in $\Sigma_1$) is $T$,
prove that the period of the Thomas precession is $T_T = \frac{T}{\gamma_u - 1}$, where $u$ is the
(constant) speed of rotation of the electron. Show that for usual speeds $T_T \gg T$.


Chapter 7
Four-Acceleration


7.1 Introduction 


Although the study of accelerated motion is necessary in Special Relativity, as a
rule, in standard textbooks little attention is paid to this subject. Perhaps this is
due to the difficulty of the comprehension of “acceleration” in spacetime and its 
involved “behavior” under the Lorentz transformation. Indeed, with the exception 
of the proper frame, the zeroth component of the acceleration four-vector enters 
in the spatial part creating confusion. Furthermore the Lorentz transformation of 
the acceleration four-vector does not reveal a clear kinematic role for the temporal 
and the spatial parts. However the extensive study of four-acceleration is necessary, 
because it completes our understanding of relativistic kinematics, relates kinematics 
with the dynamics and finally it takes Special Relativity over to General Relativity 
in a natural way. In addition the four-acceleration finds application in many physical 
phenomena, such as, the radiation of an accelerated charge, the annihilation of anti- 
proton, the resonances of strange particles etc. 

The structure of this chapter is as follows. In the first sections we study 
the general properties of four-acceleration and its behavior under the Lorentz 
transformation, especially under boosts. We prove a number of general results which 
are applied to one dimensional accelerated motion and in particular to hyperbolic 
motion, which is a covariant relativistic motion. We consider two types of hyperbolic 
motion: the case of a rigid rod and the case of a “fluid”, the latter being characterized 
by the fact that the relative velocity between its parts (particles) does not vanish. 

The most important concept in the study of accelerated motion is the synchro- 
nization of the clocks of the inertial and the accelerated observer. As we have 
seen the clocks of two inertial observers are synchronized with chronometry (Ein- 
stein synchronization), whereas we have not defined a synchronization procedure 
between an inertial and an accelerated observer. The definition of a synchronization 
for accelerated observers is necessary if we want to compare the kinematics of 


© Springer Nature Switzerland AG 2019 191 
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-27347-7_7 




such observers. We emphasize that there is no global synchronization for all
accelerated observers and one defines a synchronization per case.

We shall also discuss the extension of the Lorentz transformation between 
an inertial and an accelerated observer and shall introduce the concept of the 
“generalized” Lorentz transformation. The generalized Lorentz transformation will 
lead us to examine the incorporation of gravity within Special Relativity. The result 
will be negative as will be shown by means of three thought experiments, the 
gravitational redshift, the gravitational time dilation and finally the curvature of
spacetime, which leads directly to General Relativity.$^{1}$


7.2 The Four-Acceleration 


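Consider a relativistic mass particle (ReMaP) $P$ (not a photon!) with position
four-vector $x^i$ and four-velocity $u^i = \frac{dx^i}{d\tau}$. The four-acceleration $a^i$ of $P$ is defined as
follows [see equation (6.2)]:

$$a^i = \frac{du^i}{d\tau} \tag{7.1}$$

and it is a spacelike vector normal to the four-velocity vector (because $u^ia_i = 0$).
Consider an arbitrary RIO $\Sigma$, say, in which the four-vectors $x^i$ and $u^i$ have
components $x^i = \begin{pmatrix} l \\ \mathbf{r} \end{pmatrix}_\Sigma$ and $u^i = \begin{pmatrix} \gamma c \\ \gamma\mathbf{v} \end{pmatrix}_\Sigma$ respectively, where $\gamma = (1 - \beta^2)^{-1/2}$
and $l = ct$. Then in $\Sigma$ the components of the four-acceleration are:

$$a^i = \frac{du^i}{d\tau} = \gamma\frac{du^i}{dt} = \gamma\begin{pmatrix} \dot\gamma c \\ \dot\gamma\mathbf{v} + \gamma\dot{\mathbf{v}} \end{pmatrix}_\Sigma \tag{7.2}$$

where a dot over a symbol indicates derivative wrt $t$, e.g. $\dot\gamma = \frac{d\gamma}{dt}$. We set $\dot{\mathbf{v}} = \mathbf{a}$, the
3-acceleration of $P$ in $\Sigma$, and (7.2) is written as follows:

$$a^i = \begin{pmatrix} \gamma\dot\gamma c \\ \gamma\dot\gamma\mathbf{v} + \gamma^2\mathbf{a} \end{pmatrix}_\Sigma. \tag{7.3}$$

One computes the quantity $\dot\gamma$ most conveniently by means of the orthogonality
condition $u^ia_i = 0$. Indeed we have:

$$(\dot\gamma\mathbf{v} + \gamma\mathbf{a})\cdot\mathbf{v} - c^2\dot\gamma = 0 \;\Rightarrow\; \dot\gamma = \frac{1}{c^2}\gamma^3(\mathbf{v}\cdot\mathbf{a}). \tag{7.4}$$

Replacing in (7.3) we obtain the final expression:

$$a^i = \begin{pmatrix} c\,a^0 \\ a^0\mathbf{v} + \gamma^2\mathbf{a} \end{pmatrix}_\Sigma \tag{7.5}$$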

1 This chapter has been greatly improved thanks to the comments of Roger Berlind, whom I sincerely
thank for his assistance.




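where:

$$a^0 = \gamma\dot\gamma = \frac{\gamma^4}{c^2}(\mathbf{v}\cdot\mathbf{a}) = \gamma^2\,\frac{\mathbf{v}\cdot\mathbf{a}}{c^2 - v^2}. \tag{7.6}$$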
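The components (7.5) and (7.6) are easy to evaluate for concrete numbers. The sketch below, assuming $c = 1$ and the signature $(-, +, +, +)$, builds the four-velocity and the four-acceleration from a hypothetical 3-velocity and 3-acceleration and checks that they are orthogonal. The function names are illustrative.

```python
import math

C = 1.0  # assumption: units in which c = 1


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gamma(v):
    return 1.0 / math.sqrt(1.0 - dot(v, v) / C**2)


def four_velocity(v):
    g = gamma(v)
    return (g * C,) + tuple(g * x for x in v)


def four_acceleration(v, a):
    """Components (c a0, a0 v + gamma^2 a) with a0 = gamma^4 (v.a) / c^2."""
    g = gamma(v)
    a0 = g**4 * dot(v, a) / C**2
    return (C * a0,) + tuple(a0 * vi + g**2 * ai for vi, ai in zip(v, a))


def minkowski(x, y):
    """Lorentz inner product with signature (-, +, +, +)."""
    return -x[0] * y[0] + dot(x[1:], y[1:])


v = (0.3, -0.4, 0.2)     # hypothetical 3-velocity in Sigma
a = (1.0, 0.5, -2.0)     # hypothetical 3-acceleration in Sigma

u_i = four_velocity(v)
a_i = four_acceleration(v, a)

print(minkowski(u_i, a_i))   # orthogonality: should vanish to rounding error
print(minkowski(a_i, a_i))   # positive: the four-acceleration is spacelike
```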


We note that both the temporal part and the spatial part of the four-vector $a^i$
depend on both $\mathbf{v}$ and $\mathbf{a}$. Therefore it is not obvious why we should consider the
four-vector $a^i$ as the “generalization” of acceleration of Newtonian Physics in Special
Relativity. In addition we note that the temporal part $a^0$ enters the spatial part. These
observations indicate that we should be careful and search for proper answers.

We recall (see Sect. 6.2) that the proper frame $\Sigma^+$ (say) of $P$ is also a
characteristic frame of the four-vector $a^i$ and that in $\Sigma^+$ $a^i$ has the form:

$$a^i = \begin{pmatrix} 0 \\ \mathbf{a}^+ \end{pmatrix}_{\Sigma^+} \tag{7.7}$$

where $\mathbf{a}^+$ is the 3-acceleration of $P$ in its proper frame. But how can we conceive
kinematically the statement “I am accelerating in my own proper frame”? The
answer is as follows.

We consider two relativistic mass particles, $P$, $Q$ say, which at some moment $t$
in the frame of a RIO $\Sigma$ have relative velocity $\mathbf{v}_{PQ}(t) = 0$ and relative position
vector $\mathbf{r}_{PQ}(t)$. Furthermore we consider that $P$ has constant velocity in $\Sigma$ whereas
$Q$ is accelerating in $\Sigma$. Then the proper frame $\Sigma_P^+$ of $P$ is a RIO (related to $\Sigma$
with a Lorentz transformation) whereas the proper frame $\Sigma_Q^+$ of $Q$ is not a RIO and
it is not related with a Lorentz transformation either with $\Sigma$ or with $\Sigma_P^+$. This is
expected because the Lorentz transformation relates two RIO and not a RIO with an
accelerated observer. There are two options to continue:


• Either find a way to interpret the accelerated motion in terms of inertial motions,
or

• Extend by means of some new definitions and principles the Lorentz transformation
to relate a RIO with an accelerated observer. In this generalization one must
demand that in the limit of vanishing acceleration the transformation will reduce
to the standard Lorentz transformation.


In this section we consider the first option and leave the second for subsequent 
sections. 

At the moment $t + dt$ of $\Sigma$ the relative position of $P$, $Q$ in $\Sigma$ is $\mathbf{r}_{PQ}(t + dt)$,
different from $\mathbf{r}_{PQ}(t)$, and to be specific:

$$\mathbf{r}_{PQ}(t + dt) = \mathbf{r}_{PQ}(t) + \mathbf{v}_{PQ}(t + dt)\,dt$$

where the relative velocity $\mathbf{v}_{PQ}(t + dt)$ of $P$, $Q$ at the moment $t + dt$ of $\Sigma$ is:

$$\mathbf{v}_{PQ}(t + dt) = \mathbf{a}_{PQ}(t)\,dt$$

where $\mathbf{a}_{PQ}(t)$ is the relative acceleration of $Q$ in $\Sigma_P^+$. Because $\Sigma_P^+$ is a RIO, $\mathbf{a}_{PQ}(t)$
equals the acceleration of $Q$ in $\Sigma$, therefore it is different from zero. From this
analysis we infer the following:


1. At the moment $t$ of $\Sigma$ the proper frames $\Sigma_P^+$ and $\Sigma_Q^+$ coincide and $\Sigma_P^+$ is a
RIO. We call $\Sigma_P^+$ the Local Relativistic Inertial Observer (LRIO) of $Q$ at the
moment $t$ of $\Sigma$. At the moment $t + dt$ of $\Sigma$ the frames $\Sigma_P^+$ and $\Sigma_Q^+$ do not coincide and
there is another LRIO for $Q$ whose velocity in $\Sigma$ is $\mathbf{v}_{PQ}(t + dt)$. This means
that we can interpret kinematically the acceleration $\mathbf{a}_{PQ}(t)$ in $\Sigma$ as a continuous
change of LRIO. We conclude that the relativistic accelerated motion can be
understood as a continuous sequence of (relativistic) inertial motions of varying
velocity, or, equivalently as a continuous sequence of Lorentz transformations,
parameterized by the proper time of $Q$.

2. Geometrically the above interpretation can be understood as follows. The world- 
line of an accelerated observer is a timelike differentiable curve in Minkowski 
space. This curve can be approximated by a great number of small straight 
line segments, as is done, for example, in the computation of the length of a 
curve. Each such segment can be seen as a portion of a straight line, which 
subsequently is identified with the worldline of a RIO. In the limit the worldline 
of an accelerated observer is approximated by the continuous sequence of its 
tangents, parameterized by the proper time of the accelerated observer. 


By means of the above approach we achieve two goals: 


• We explain the accelerated motion in terms of inertial motions.

• It is possible to apply the Lorentz transformation along the worldline of the
accelerated observer provided that at each event we change the LRIO or the
transformation.
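The picture of accelerated motion as a sequence of inertial motions can be tried out on the simplest case. The sketch below treats one-dimensional motion with constant proper acceleration $g$ (an assumption made for the illustration, with $c = 1$): in each small step of proper time the particle gains the Newtonian velocity $g\,d\tau$ in its current LRIO, and this gain is composed relativistically with the velocity of that LRIO. The result is compared with the standard expression $v = \tanh(g\tau)$ for hyperbolic motion.

```python
import math

g = 1.0            # proper acceleration (hypothetical value, c = 1)
tau_end = 2.0      # total proper time
N = 10000          # number of local inertial observers used
dtau = tau_end / N

v = 0.0            # velocity in Sigma, starting from rest
for _ in range(N):
    dv = g * dtau                      # velocity gained in the current LRIO
    v = (v + dv) / (1.0 + v * dv)      # composition with the velocity of that LRIO

print(v, math.tanh(g * tau_end))
```

Increasing `N` brings the two numbers closer, while the speed always stays below $c$, however long the acceleration lasts.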


Exercise 7.2.1 Show that if the 3-acceleration of a ReMaP vanishes in a RIO, then 
it vanishes for all RIO. This implies that the accelerated motion is covariant in 
Special Relativity. That is, if a relativistic observer is accelerating wrt a RIO then 
it is accelerating wrt any other RIO. This result differentiates the inertial motion 
from the accelerated motion and in fact it is the expression of Newton’s First Law in 
Special Relativity. 


As is the case with all relativistic physical quantities, the four-acceleration must
be associated with a Newtonian physical quantity, or, must be postulated as a pure
relativistic physical quantity. This is done as follows. In the proper frame $\Sigma^+$ of
the position vector of an accelerated ReMaP (= relativistic mass point) $P$ the
four-acceleration has the reduced form $(0, \mathbf{a}^+)_{\Sigma^+}$, i.e. it is specified completely by a
3-vector $\mathbf{a}^+$. We postulate that the 3-acceleration which is measured by the LRIO $\Sigma_\tau$
of $\Sigma^+$ at the proper time moment $\tau$ of $P$ coincides with the Newtonian acceleration
of $P$ as measured by $\Sigma_\tau$. We assume that this is the acceleration that the proper
observer of $P$ “feels” or measures (for example with a gravitometer). We note the
following result, which is consistent with this identification.




Exercise 7.2.2 Show that $\mathbf{a}^+ = 0$ if and only if $a^ia_i = 0$ (see Exercise 7.2.1).


We note that:

$$a^ia_i = (\mathbf{a}^+)^2 = a^2 \tag{7.8}$$

that is, $(\mathbf{a}^+)^2$ is an invariant. This means that if a RIO measures the quantity
$(\mathbf{a}^+)^2$ then this number is the same for all other RIO. However, the vector $\mathbf{a}^+$
itself transforms in a complicated way, and this is what makes the study of
four-acceleration difficult.


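Example 7.2.1 Calculate the Lorentz length $a^2$ of the four-acceleration $a^i$ in a RIO
$\Sigma$ in which the velocity of a ReMaP is $\mathbf{v}$ and its 3-acceleration $\mathbf{a}$. Show that if in
$\Sigma$ the 3-acceleration $\mathbf{a}$ is parallel to the 3-velocity $\mathbf{v}$ then:

$$a^2 = \left[\frac{d(\gamma\mathbf{v})}{dt}\right]^2_\Sigma \tag{7.9}$$

whereas if it is normal to the 3-velocity then:

$$a^2 = \gamma^4\mathbf{a}^2.$$

Hint: Use the identity $\dot\gamma = \boldsymbol{\beta}\cdot\dot{\boldsymbol{\beta}}\,\gamma^3 = \frac{\mathbf{v}\cdot\dot{\mathbf{v}}}{c^2}\gamma^3$ [see (7.4)].

Solution

In $\Sigma$, $a^i = \gamma\begin{pmatrix} \dot\gamma c \\ \dot\gamma\mathbf{v} + \gamma\mathbf{a} \end{pmatrix}_\Sigma$ (see (7.3)). Hence:

$$
\begin{aligned}
a^ia_i = a^2 &= c^2\gamma^2\left(-\dot\gamma^2 + (\dot\gamma\boldsymbol{\beta} + \gamma\dot{\boldsymbol{\beta}})^2\right) \\
&= c^2\gamma^2\left(-\dot\gamma^2 + \dot\gamma^2\beta^2 + \gamma^2\dot\beta^2 + 2\boldsymbol{\beta}\cdot\dot{\boldsymbol{\beta}}\,\gamma\dot\gamma\right) \\
&= c^2\gamma^2\left(-\dot\gamma^2(1 - \beta^2) + \gamma^2\dot\beta^2 + 2\beta\dot\beta\cos\phi\,\gamma\dot\gamma\right) \\
&= c^2\gamma^2\left(-\frac{\dot\gamma^2}{\gamma^2} + \gamma^2\dot\beta^2 + 2\beta\dot\beta\cos\phi\,\gamma\dot\gamma\right).
\end{aligned}
$$

From the hint we have $\dot\gamma = \beta\dot\beta\gamma^3\cos\phi$. Replacing we find:

$$a^2 = c^2\gamma^2\left(-\beta^2\dot\beta^2\gamma^4\cos^2\phi + \gamma^2\dot\beta^2 + 2\gamma^4\beta^2\dot\beta^2\cos^2\phi\right) \;\Rightarrow$$

$$a^2 = c^2\gamma^4\dot\beta^2\left(1 + \gamma^2\beta^2\cos^2\phi\right). \tag{7.10}$$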


2 Show that this result can be written in the equivalent form:

$$(\mathbf{a}^+)^2 = \gamma^6\left[\mathbf{a}^2 - (\mathbf{a}\times\boldsymbol{\beta})^2\right].$$




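This is the general case. In case $\boldsymbol{\beta}\parallel\dot{\boldsymbol{\beta}}$ (that is $\mathbf{v}\parallel\mathbf{a}$), $\cos\phi = 1$ and the above
formula reduces to

$$a^2 = c^2\gamma^4\dot\beta^2\left(1 + \gamma^2\beta^2\right) = c^2\gamma^6\dot\beta^2 = \gamma^6\mathbf{a}^2.$$

This can be written differently. We have:

$$(\beta\gamma)^{\cdot} = \dot\beta\gamma + \beta\dot\gamma = \dot\beta\gamma + \beta^2\dot\beta\gamma^3 = \dot\beta\gamma(1 + \beta^2\gamma^2) = \dot\beta\gamma^3$$

hence:

$$a^2 = c^2\left[(\gamma\beta)^{\cdot}\right]^2 = \left[(\gamma v)^{\cdot}\right]^2. \tag{7.11}$$

In case $\boldsymbol{\beta}\perp\dot{\boldsymbol{\beta}}$, $\cos\phi = 0$ and we get:

$$a^2 = c^2\gamma^4\dot\beta^2 = \gamma^4\mathbf{a}^2. \tag{7.12}$$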
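The general expression (7.10) and its two special cases can be verified numerically. The sketch below, assuming $c = 1$ (so that $c\dot{\boldsymbol{\beta}} = \mathbf{a}$) and the signature $(-, +, +, +)$, computes $a^ia_i$ directly from the components of the four-acceleration and compares it with $\gamma^4\mathbf{a}^2(1 + \gamma^2\beta^2\cos^2\phi)$. Function names and input values are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gamma(v):
    return 1.0 / math.sqrt(1.0 - dot(v, v))  # c = 1


def lorentz_length_sq(v, a):
    """a_i a^i from the components (a0, a0 v + gamma^2 a), a0 = gamma^4 (v.a)."""
    g = gamma(v)
    a0 = g**4 * dot(v, a)
    spatial = [a0 * vi + g**2 * ai for vi, ai in zip(v, a)]
    return -a0**2 + dot(spatial, spatial)


def closed_form(v, a):
    """gamma^4 a^2 (1 + gamma^2 beta^2 cos^2 phi), phi being the angle between v and a."""
    g = gamma(v)
    cos_phi = dot(v, a) / math.sqrt(dot(v, v) * dot(a, a))
    return g**4 * dot(a, a) * (1.0 + g**2 * dot(v, v) * cos_phi**2)


v = (0.6, 0.0, 0.0)
general = (lorentz_length_sq((0.3, -0.4, 0.2), (1.0, 0.5, -2.0)),
           closed_form((0.3, -0.4, 0.2), (1.0, 0.5, -2.0)))
parallel = lorentz_length_sq(v, (2.0, 0.0, 0.0))   # expected gamma^6 a^2
normal = lorentz_length_sq(v, (0.0, 2.0, 0.0))     # expected gamma^4 a^2
print(general, parallel, normal)
```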
It becomes clear that the concept of four-acceleration is much more complicated 
and involved than that of the four-velocity. This should be expected because, unlike 


velocity, acceleration relates kinematics with dynamics. We continue with some 
useful examples. 


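Example 7.2.2 A ReMaP $P$ in the RIO $\Sigma$ has 3-acceleration $\mathbf{a}$ and 3-velocity $\mathbf{v}$.
Calculate the 3-acceleration of $P$ in the RIO $\Sigma'$ which moves wrt $\Sigma$ in the standard
configuration with speed $u$ along the common axis $x, x'$.

Solution

The four-acceleration in $\Sigma$ is given by:

$$a^i = \begin{pmatrix} c\,a^0 \\ a^0\mathbf{v} + \gamma_v^2\mathbf{a} \end{pmatrix}_\Sigma$$

where $a^0 = \gamma_v\dot\gamma_v = \gamma_v^4c^{-2}(\mathbf{v}\cdot\mathbf{a})$. Suppose that in $\Sigma'$ $P$ has velocity $\mathbf{v}'$ and
acceleration $\mathbf{a}'$, so that:

$$a^{i'} = \begin{pmatrix} c\,a^{0'} \\ a^{0'}\mathbf{v}' + \gamma_{v'}^2\mathbf{a}' \end{pmatrix}_{\Sigma'}$$

where $a^{0'} = \gamma_{v'}\dot\gamma_{v'} = \gamma_{v'}^4c^{-2}(\mathbf{v}'\cdot\mathbf{a}')$.
The two expressions are related with a boost:

$$
\begin{aligned}
c\,a^{0'} &= \gamma_u\left[c\,a^0 - \frac{u}{c}\left(a^0v^x + \gamma_v^2a^x\right)\right] \\
a^{0'}v^{x'} + \gamma_{v'}^2a^{x'} &= \gamma_u\left(a^0v^x + \gamma_v^2a^x - \frac{u}{c}\,c\,a^0\right) \\
a^{0'}v^{y'} + \gamma_{v'}^2a^{y'} &= a^0v^y + \gamma_v^2a^y \\
a^{0'}v^{z'} + \gamma_{v'}^2a^{z'} &= a^0v^z + \gamma_v^2a^z.
\end{aligned}
$$

The first equation gives (we write $v^x = v_x$ and $a^x = a_x$, why?):

$$a^{0'} = \gamma_ua^0\left(1 - \frac{uv_x}{c^2}\right) - \frac{u}{c^2}\gamma_u\gamma_v^2a_x. \tag{7.13}$$

Replacing in the second follows:

$$
\begin{aligned}
&\gamma_ua^0\left(1 - \frac{uv_x}{c^2}\right)v_{x'} - \frac{u}{c^2}v_{x'}\gamma_u\gamma_v^2a_x + \gamma_{v'}^2a_{x'} = \gamma_u\left(a^0v_x + \gamma_v^2a_x - ua^0\right) \;\Rightarrow \\
&\gamma_{v'}^2a_{x'} = \gamma_u\gamma_v^2\left(1 + \frac{uv_{x'}}{c^2}\right)a_x + a^0\gamma_u\left[v_x - u - v_{x'}\left(1 - \frac{uv_x}{c^2}\right)\right].
\end{aligned}
$$

But from the composition rule of 3-velocities we have $v_{x'} = \frac{v_x - u}{1 - \frac{uv_x}{c^2}}$, therefore the
second term in the rhs vanishes. Hence:

$$a_{x'} = \gamma_u\frac{\gamma_v^2}{\gamma_{v'}^2}\left(1 + \frac{uv_{x'}}{c^2}\right)a_x. \tag{7.14}$$

But from (6.7) we have:

$$\gamma_v = \gamma_u\gamma_{v'}\left(1 + \frac{uv_{x'}}{c^2}\right) \;\Rightarrow\; \frac{\gamma_v}{\gamma_{v'}} = \gamma_u\left(1 + \frac{uv_{x'}}{c^2}\right). \tag{7.15}$$

Replacing we find:

$$a_{x'} = \frac{\gamma_v^3}{\gamma_{v'}^3}\,a_x.$$

This relation can be written in the form:

$$\gamma_{v'}^3a_{x'} = \gamma_v^3a_x \tag{7.16}$$

which indicates that the quantity $\gamma_v^3a_x$ is an invariant under boosts. This is a very
useful result in applications.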
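The invariance expressed by (7.16) can be confirmed with numbers. The sketch below, assuming $c = 1$, boosts the temporal and the $x$-components of the four-acceleration along $x$, extracts $a_{x'}$ from the boosted spatial component, and compares $\gamma_{v'}^3a_{x'}$ with $\gamma_v^3a_x$. The input values are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gamma(v):
    return 1.0 / math.sqrt(1.0 - dot(v, v))  # c = 1


u = 0.5                      # speed of Sigma' along the common x-axis (hypothetical)
v = (0.3, 0.4, -0.2)         # 3-velocity of P in Sigma
a = (1.0, -0.5, 0.7)         # 3-acceleration of P in Sigma

gu, gv = gamma((u, 0.0, 0.0)), gamma(v)
D = 1.0 - u * v[0]
v_prime = ((v[0] - u) / D, v[1] / (gu * D), v[2] / (gu * D))   # velocity composition
gv_prime = gu * gv * D                                          # composition of gammas

a0 = gv**4 * dot(v, a)                 # temporal part of the four-acceleration (over c)
sx = a0 * v[0] + gv**2 * a[0]          # spatial x-component in Sigma
a0_prime = gu * (a0 - u * sx)          # boosted temporal part
sx_prime = gu * (sx - u * a0)          # boosted spatial x-component
a_prime_x = (sx_prime - a0_prime * v_prime[0]) / gv_prime**2

print(gv_prime**3 * a_prime_x, gv**3 * a[0])   # the two numbers should coincide
```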
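Equation (7.14) can be written differently. From equation (7.15) follows:

$$\gamma_{v'} = \gamma_u\gamma_v\left(1 - \frac{uv_x}{c^2}\right) \;\Rightarrow\; \frac{\gamma_v}{\gamma_{v'}} = \frac{1}{\gamma_u\left(1 - \frac{uv_x}{c^2}\right)}$$

and (7.14) becomes:

$$a_{x'} = \frac{1}{\gamma_u^3\left(1 - \frac{uv_x}{c^2}\right)^3}\,a_x. \tag{7.17}$$

Working similarly we show that:

$$a_{y'} = \frac{1}{\gamma_u^2\left(1 - \frac{uv_x}{c^2}\right)^2}\left[a_y + \frac{uv_y}{c^2 - uv_x}\,a_x\right] \tag{7.18}$$

$$a_{z'} = \frac{1}{\gamma_u^2\left(1 - \frac{uv_x}{c^2}\right)^2}\left[a_z + \frac{uv_z}{c^2 - uv_x}\,a_x\right]. \tag{7.19}$$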

It is worth discussing the kinematic implications of the transformation relations
(7.17), (7.18), and (7.19). They imply that the 3-acceleration $\mathbf{a}'$ in $\Sigma'$ depends on
both the 3-acceleration $\mathbf{a}$ and the 3-velocity $\mathbf{v}$ of $P$ in $\Sigma$. Therefore, in general, a
motion along the $x$-axis with constant acceleration $\mathbf{a} = a_x\mathbf{i}$ does not imply that in
$\Sigma'$ $a_{y'}, a_{z'}$ vanish unless $v_y = v_z = 0$! This means that in $\Sigma'$ the motion is in general
3-dimensional and furthermore it is not with uniform acceleration.

Let us examine the important case of planar motion. It is easy to show that if
the motion in $\Sigma$ is accelerated in the $y - z$ plane (in general a plane normal to the
direction of the relative velocity of $\Sigma, \Sigma'$) then the motion in $\Sigma'$ is also planar. In
all other cases the motion in $\Sigma'$ is not planar. This means that the concept of planar
motion in Special Relativity is not covariant, hence it has no physical meaning in
that theory. An important motion of this type of great interest in Physics is the central
motion, which we examine in the following exercise.


Exercise 7.2.3 A RIO $\Sigma'$ is moving wrt the RIO $\Sigma$ in the standard configuration
with speed $u$ along the common $x, x'$ axis. Show that a central motion in the plane
$y, z$ of $\Sigma$ is also a central motion in the plane $y', z'$ of $\Sigma'$.


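For a general relative velocity $\mathbf{u}$ of $\Sigma, \Sigma'$ relations (7.17), (7.18), and (7.19) read
as follows:

$$\mathbf{a}' = \frac{1}{\gamma_u^2\left(1 - \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right)^3}\left[\left(1 - \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right)\mathbf{a} + \frac{\mathbf{u}\cdot\mathbf{a}}{c^2}\,\mathbf{v} + \frac{\mathbf{u}\cdot\mathbf{a}}{u^2}\left(\frac{1}{\gamma_u} - 1\right)\mathbf{u}\right] \tag{7.20}$$

where $\mathbf{a}, \mathbf{a}'$ is the 3-acceleration of the ReMaP $P$ in $\Sigma, \Sigma'$ respectively and $\mathbf{v}$ is the
velocity of the ReMaP in $\Sigma$. From this relation we find that the parallel and the
normal components of $\mathbf{a}'$ wrt $\mathbf{u}$ are:

$$\mathbf{a}'_\parallel = \frac{1}{\gamma_u^3\left(1 - \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right)^3}\,\mathbf{a}_\parallel \tag{7.21}$$

$$\mathbf{a}'_\perp = \frac{1}{\gamma_u^2\left(1 - \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right)^3}\left[\mathbf{a}_\perp - \frac{1}{c^2}\,\mathbf{u}\times(\mathbf{a}\times\mathbf{v})\right] \tag{7.22}$$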


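When $\mathbf{v}$ is parallel to $\mathbf{u}$ the normal component reduces to:

$$\mathbf{a}'_\perp = \frac{1}{\gamma_u^2 \left(1 - \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right)^2}\, \mathbf{a}_\perp. \tag{7.23}$$
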
The proof of (7.20), (7.21), (7.22), and (7.23) is tedious but standard. 


Exercise 7.2.4 Consider two LCF Σ, Σ' with parallel axes and relative velocity
u. A ReMaP P in Σ, Σ' has 3-velocity v, v' and 3-acceleration a, a' respectively.
Show that the proper 3-acceleration of P satisfies the relation:

$$\mathbf{a}^+ = \gamma_v^3 \mathbf{a}_\parallel + \gamma_v^2 \mathbf{a}_\perp = \gamma_{v'}^3 \mathbf{a}'_\parallel + \gamma_{v'}^2 \mathbf{a}'_\perp \tag{7.24}$$

where the parallel and the normal analysis refers to the direction of the relative
velocity u of Σ, Σ'. Using this result prove that:

1. A ReMaP moves along a straight line in both Σ and Σ' if and only if it is moving
along the direction of the relative velocity u of Σ, Σ'. If this is the case then the
accelerations satisfy the relation:

$$\gamma_v^3 \mathbf{a}_\parallel = \gamma_{v'}^3 \mathbf{a}'_\parallel = \mathbf{a}^+. \tag{7.25}$$

2. A ReMaP moves in a planar motion in both RIO Σ, Σ' if, and only if, this
plane is normal to the relative velocity u of Σ, Σ'. Furthermore in that case
the accelerations satisfy the relation:

$$\gamma_v^2 \mathbf{a}_\perp = \gamma_{v'}^2 \mathbf{a}'_\perp = \mathbf{a}^+. \tag{7.26}$$

[Hint: Assume that one of the RIO, the Σ' say, coincides with the proper frame
of P. Then u = v, v' = 0 and equations (7.21), (7.23) give $\mathbf{a}'_\parallel = \gamma_v^3 \mathbf{a}_\parallel$, $\mathbf{a}'_\perp = \gamma_v^2 \mathbf{a}_\perp$.]


7.3 Calculating Accelerated Motions 


When we are given an accelerated motion in a RIO Σ' and wish to describe the
motion in another RIO Σ which is related to Σ' with the Lorentz transformation
L(Σ', Σ) there are two methods to work:


• 1st Method: Use the Lorentz transformation to calculate the acceleration in Σ
and then solve the differential equation $d^2\mathbf{r}/dt^2 = \mathbf{a}$ in Σ. Because the
transformation of the four-acceleration involves the 3-velocity we have to transform the
four-velocity too.

• 2nd Method: Solve the differential equation $d^2\mathbf{r}'/dt'^2 = \mathbf{a}'$ in Σ', that is compute the
3-orbit r'(t'), and then use the Lorentz transformation to find the orbit in Σ.


As a rule the second method is simpler because we choose Σ' so that the
equations of motion are simpler.


Example 7.3.1 The LCF Σ and Σ' move in the standard configuration with speed u
along the common axis x, x'. A ReMaP P departs from the origin at rest at the time moment
$t' = 0$ (of Σ') and moves along the x'-axis with constant acceleration (in Σ'!) $a' = c/\tau$,
where τ is the life time of P (a constant). If $u = c/\sqrt{2}$, show that the motion in Σ
is described by the equation:

$$x^2 - 2\sqrt{2}\,c\,(t + \tau)\,x + 2c^2 t\,(t + \tau) = 0.$$

For $t \ll \tau$ show that the accepted solution is $x = ct/\sqrt{2}$. Comment on the result.


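Solution
The position four-vector of P in Σ and Σ' is:

$$x^i = \begin{pmatrix} ct \\ x \end{pmatrix}_{\Sigma} = \begin{pmatrix} ct' \\ x' \end{pmatrix}_{\Sigma'}.$$

It is easy to calculate that in Σ' the trajectory of P is described by the equation:

$$x' = \frac{c\,t'^2}{2\tau}. \tag{7.27}$$

To calculate the equation of motion of P in Σ we apply the boost along the x, x'-axis
with speed u:

$$x = \gamma(u)\,(x' + u t'), \qquad ct' = \gamma(u)\left(ct - \frac{u}{c}\,x\right). \tag{7.28}$$

Replacing x' from (7.27) and t' in (7.28) we find

$$x = \gamma(u)\left[\frac{c\,\gamma^2(u)}{2\tau}\left(t - \frac{u}{c^2}\,x\right)^2 + u\,\gamma(u)\left(t - \frac{u}{c^2}\,x\right)\right].$$

For $u = c/\sqrt{2}$ we compute $\gamma(u) = \sqrt{2}$. It follows that the motion of P in Σ is
described by the equation:

$$x^2 - 2\sqrt{2}\,c\,(t + \tau)\,x + 2c^2 t\,(t + \tau) = 0.$$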


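The last relation can be written:

$$x^2 - 2\sqrt{2}\,c\tau\left(1 + \frac{t}{\tau}\right)x + 2c^2\tau t\left(1 + \frac{t}{\tau}\right) = 0.$$

If $t/\tau \ll 1$ this becomes:

$$x^2 - 2\sqrt{2}\,c\tau\,x + 2c^2\tau t = 0 \Rightarrow x = \sqrt{2}\,\tau c \mp \sqrt{2c^2\tau^2 - 2c^2\tau t}$$

therefore:

$$x = \sqrt{2}\,\tau c\left(1 \mp \sqrt{1 - \frac{t}{\tau}}\right) \approx \sqrt{2}\,c\tau\left[1 \mp \left(1 - \frac{t}{2\tau}\right)\right] = \begin{cases} \dfrac{ct}{\sqrt{2}} \\[2ex] 2\sqrt{2}\,c\tau\left(1 - \dfrac{t}{4\tau}\right). \end{cases}$$

The first solution is selected if the initial condition is x(0) = 0 and the
second solution if the initial condition is $x(0) = 2\sqrt{2}\,c\tau$. According to the Lorentz
transformation when t = t' = 0 then x = x' = 0, which implies that the accepted
solution is $x = ct/\sqrt{2}$.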
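
As a numerical cross-check of Example 7.3.1, the following Python sketch follows the second method: it takes events on the orbit $x' = ct'^2/(2\tau)$ in Σ', maps them to Σ with the inverse boost for $u = c/\sqrt{2}$, and evaluates the left-hand side of the quadratic equation of the example. The function names and the sample values (c = 1, τ = 1) are assumptions for the illustration.

```python
import math

def event_in_sigma(t_prime, tau, c=1.0):
    """Map the event of P at time t' of Sigma' to (t, x) in Sigma, for u = c/sqrt(2)."""
    u = c / math.sqrt(2.0)
    g = 1.0 / math.sqrt(1.0 - (u / c) ** 2)   # gamma(u), equal to sqrt(2)
    x_prime = c * t_prime**2 / (2.0 * tau)    # orbit in Sigma', a' = c/tau from rest
    t = g * (t_prime + u * x_prime / c**2)    # inverse boost, time
    x = g * (x_prime + u * t_prime)           # inverse boost, position
    return t, x

def orbit_residual(t, x, tau, c=1.0):
    """Left-hand side of x^2 - 2*sqrt(2)*c*(t+tau)*x + 2*c^2*t*(t+tau) = 0."""
    s = math.sqrt(2.0)
    return x**2 - 2.0 * s * c * (t + tau) * x + 2.0 * c**2 * t * (t + tau)

tau = 1.0
residuals = [orbit_residual(*event_in_sigma(tp, tau), tau)
             for tp in (0.0, 0.5, 1.0, 2.0)]

# Early times, t << tau: compare x with c*t/sqrt(2).
t_small, x_small = event_in_sigma(1.0e-3, tau)

print(residuals)
print(x_small, t_small / math.sqrt(2.0))
```

The residuals vanish to rounding accuracy, and for $t \ll \tau$ the position is close to $ct/\sqrt{2} = ut$, as expected for a particle that starts at rest in Σ'.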

Example 7.3.2 (Hyperbolic and harmonic oscillator) A ReMaP P is moving with
constant speed u along the x'-axis of a LCF Σ'. When P passes through the origin
of Σ' it begins to accelerate with acceleration $a' = k^2 x'$ where k is a positive
constant. Let Σ be another LCF which is related to Σ' with a boost along the x-axis
with velocity factor β. Compute the position, the velocity and the acceleration of
P in Σ. Repeat the calculations for $a' = -k^2 x'$.

Solution

The equation of motion of P in Σ' is $\ddot{x}' = k^2 x'$ with the initial condition
$x'(0) = 0$, $\dot{x}'(0) = u$. Therefore in Σ' the orbit is:

$$x'(t') = \frac{u}{k}\sinh kt'. \tag{7.29}$$

In order to find the orbit in Σ we apply the boost relating Σ, Σ'. We have:

$$t' = \gamma\left(t - \frac{\beta}{c}\,x\right) \tag{7.30}$$

$$x = \gamma\,(x' + \beta c t'). \tag{7.31}$$

From (7.29) and (7.30) follows:

$$x' = \frac{u}{k}\sinh\left[k\gamma\left(t - \frac{\beta}{c}\,x\right)\right]. \tag{7.32}$$


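Replacing x', t' from (7.30) and (7.32) in (7.31) we compute the required orbit:

$$\begin{aligned}
x &= \gamma\left[\frac{u}{k}\sinh\left[k\gamma\left(t - \frac{\beta}{c}\,x\right)\right] + \beta c\gamma\left(t - \frac{\beta}{c}\,x\right)\right] \Rightarrow \\
x &= \frac{\gamma u}{k}\sinh\left[k\gamma\left(t - \frac{\beta}{c}\,x\right)\right] + \beta c\gamma^2 t - \beta^2\gamma^2 x \Rightarrow \\
(1 + \beta^2\gamma^2)\,x - \beta c\gamma^2 t &= \frac{\gamma u}{k}\sinh\left[k\gamma\left(t - \frac{\beta}{c}\,x\right)\right].
\end{aligned}$$

But $1 + \beta^2\gamma^2 = \gamma^2$ hence:

$$x - \beta c t = \frac{u}{k\gamma}\sinh\left[k\gamma\left(t - \frac{\beta}{c}\,x\right)\right]. \tag{7.33}$$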

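Differentiating (7.33) wrt t we find:

$$v - \beta c = \frac{u}{k\gamma}\,k\gamma\left(1 - \frac{\beta v}{c}\right)\cosh\left[k\gamma\left(t - \frac{\beta}{c}\,x\right)\right] \Rightarrow$$

$$v - \beta c = u\left(1 - \frac{\beta v}{c}\right)\cosh\phi \quad \text{where} \quad \phi = k\gamma\left(t - \frac{\beta}{c}\,x\right).$$

Solving in terms of v we find the speed of P in Σ:

$$v = \frac{u\cosh\phi + \beta c}{1 + \frac{u\beta}{c}\cosh\phi} \tag{7.34}$$

and by differentiation the acceleration:

$$a = \frac{k u \sinh\phi\left(1 - \frac{\beta v}{c}\right)}{\gamma\left(1 + \frac{u\beta}{c}\cosh\phi\right)^2}. \tag{7.35}$$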


Working similarly for the acceleration $a' = -k^2 x'$ (harmonic oscillator in Σ' not in
Σ!) we find:

$$x'(t') = \frac{u}{k}\sin kt' \tag{7.36}$$

and

$$x = \frac{u}{k\gamma}\sin\phi + \beta c t, \qquad v = \frac{u\cos\phi + \beta c}{1 + \frac{u\beta}{c}\cos\phi}, \qquad a = -\frac{k u \sin\phi\left(1 - \frac{\beta v}{c}\right)}{\gamma\left(1 + \frac{u\beta}{c}\cos\phi\right)^2}.$$
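
The results of Example 7.3.2 for $a' = k^2 x'$ can be checked with a short Python sketch. It parametrizes the orbit by the time t' of Σ', maps each event to Σ with the inverse boost, and compares with (7.33) and (7.34). The function name and the sample values are assumptions for the illustration; the values are chosen so that the speed $u\cosh kt'$ in Σ' stays below c.

```python
import math

def event_and_speed(t_prime, u, k, beta, c=1.0):
    """Event (t, x) and speed v of P in Sigma for the orbit x' = (u/k) sinh(k t')."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    x_prime = (u / k) * math.sinh(k * t_prime)   # orbit (7.29) in Sigma'
    v_prime = u * math.cosh(k * t_prime)         # dx'/dt', must stay below c
    t = g * (t_prime + beta * x_prime / c)       # inverse of the boost (7.30)
    x = g * (x_prime + beta * c * t_prime)       # relation (7.31)
    v = (v_prime + beta * c) / (1.0 + beta * v_prime / c)  # velocity composition
    return t, x, v

u, k, beta, c = 0.1, 1.0, 0.6, 1.0
g = 1.0 / math.sqrt(1.0 - beta**2)
t_p = 0.5
t, x, v = event_and_speed(t_p, u, k, beta, c)

# The argument phi of (7.33) coincides with k t'.
phi = k * g * (t - beta * x / c)
lhs_733 = x - beta * c * t
rhs_733 = (u / (k * g)) * math.sinh(phi)
v_734 = (u * math.cosh(phi) + beta * c) / (1.0 + (u * beta / c) * math.cosh(phi))

# Independent check of the speed by a central finite difference dx/dt.
h = 1.0e-6
t_a, x_a, _ = event_and_speed(t_p - h, u, k, beta, c)
t_b, x_b, _ = event_and_speed(t_p + h, u, k, beta, c)
v_numeric = (x_b - x_a) / (t_b - t_a)

print(lhs_733, rhs_733)
print(v, v_734, v_numeric)
```

The sketch uses the composition of velocities for v, so its agreement with (7.34) and with the finite-difference slope is an independent consistency check rather than a restatement of the derivation.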


7.4 Hyperbolic Motion of a Relativistic Mass Particle 


In the last section we have shown that the motion of a ReMaP along the straight line
defined by the direction of the proper acceleration $\mathbf{a}^+$ is covariant only in the LCF
whose velocity is parallel to $\mathbf{a}^+$. Although this type of motion is very special, it is
useful because it gives us the possibility to study the kinematics of four-acceleration
in rather simple problems. More interesting is the case $a^+$ = constant. This type of
motion we name hyperbolic motion and discuss it in Example 7.4.1.


Example 7.4.1 A ReMaP P moves along the x-axis of the LCF Σ with velocity
u and constant proper acceleration $a^+$. Let τ be the proper time of P. Ignore the
superfluous coordinates y, z.

1. Calculate the four-acceleration of P in Σ.

2. Calculate the 3-acceleration $a = a_x$ of P in Σ.

3. If ψ is the rapidity of P in Σ show that $c\,\dfrac{d\psi}{d\tau} = a^+$. Calculate the four-acceleration
in terms of the rapidity ψ.

4. Give the components of the four-velocity in Σ for a general proper acceleration
$a^+(\tau)$.

5. In case $a^+$ = constant compute the four-velocity in Σ assuming the initial
condition: For τ = 0, t = 0, x(0) = x₀ and u = 0 (departure from rest in
Σ). In addition show that in Σ the following statements are valid:

- The β-factor of P in Σ is: $\beta = \tanh\dfrac{a^+\tau}{c}$.
- The γ-factor of P in Σ is: $\gamma = \cosh\dfrac{a^+\tau}{c}$.
- The position of P in Σ in terms of the proper time τ is:

$$x(\tau) - x_0 = \frac{c^2}{a^+}\left(\cosh\frac{a^+\tau}{c} - 1\right).$$

- The time of P in Σ in terms of the proper time τ is:

$$t(\tau) = \frac{c}{a^+}\sinh\frac{a^+\tau}{c}.$$

- The position of P in terms of the time t in Σ is:

$$x(t) - x_0 = \frac{c^2}{a^+}\left[\sqrt{1 + \frac{(a^+)^2 t^2}{c^2}} - 1\right].$$

It is given that $\dot\gamma = \beta\dot\beta\gamma^3$ (see (7.4)).
Solution

1. In the proper frame Σ⁺ of P at the proper time τ, the four-acceleration is:

$$a^i = \begin{pmatrix} 0 \\ a^+ \end{pmatrix}_{\Sigma^+}$$

where, for clarity we have ignored the y, z coordinates. Suppose that in Σ the
four-acceleration is $a^i = (ca^0, a^1)^t_\Sigma$. The boost relating Σ with Σ⁺ gives³:

$$a^1 = \gamma a^+, \qquad ca^0 = \beta\gamma a^+ \tag{7.38}$$

hence:

$$a^i = \begin{pmatrix} \beta\gamma a^+ \\ \gamma a^+ \end{pmatrix}_\Sigma. \tag{7.39}$$

2nd solution
The orthogonality relation $u^i a_i = 0$ between the four-velocity and the
four-acceleration gives:

$$ca^0 u^0 - a^1 u^1 = 0 \Rightarrow \begin{cases} ca^0 = \lambda u^1 \\ a^1 = \lambda u^0 \end{cases} \tag{7.40}$$

($u^0 > 0$, $\lambda > 0$ assuming motion towards the positive x-axis). The Lorentz
length of the four-acceleration is $a^+$ hence:

$$(a^+)^2 = (a^1)^2 - (ca^0)^2 = -\lambda^2\left[(u^1)^2 - (u^0)^2\right] = \lambda^2 c^2 \Rightarrow \lambda = \frac{a^+}{c}.$$

From these relations and the components of the four-velocity $u^1 = \gamma u$, $u^0 = \gamma c$
we obtain again (7.38) and (7.39).

³This boost relates Σ with the instantaneous inertial observer of P at a given event along the world
line of P. The latter is different at every point of the world line of the accelerated point P.
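2. We know that in Σ the four-acceleration $a^i$ has components (see (7.5)):

$$a^i = \begin{pmatrix} \gamma\dot\gamma c \\ \gamma\dot\gamma u + \gamma^2 a \end{pmatrix}_\Sigma.$$

Comparing with (7.39) we find the equations:

$$\left.\begin{aligned} \beta\gamma a^+ &= c\gamma\dot\gamma \\ \gamma a^+ &= \gamma\dot\gamma u + \gamma^2 a \end{aligned}\right\} \Rightarrow a = \frac{1}{\gamma^3}\,a^+. \tag{7.41}$$

2nd solution to the second question
As we have shown in Example 7.3.1 [see relation (7.16)] the quantity $a_x\gamma^3$ is
an invariant. In the proper frame $\gamma^+ = 1$ hence $a_x\gamma^3 = a^+$ etc.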
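3. The rapidity ψ of P in Σ is defined by the relation $\gamma = \cosh\psi$. Differentiating wrt
t we find $\dot\gamma = \dot\psi\sinh\psi$. Replacing $\dot\gamma$ from $\dot\gamma = \beta\dot\beta\gamma^3$ and $\sinh\psi = \beta\gamma$ we find:

$$\dot\psi = \dot\beta\gamma^2 = \frac{a\gamma^2}{c} \Rightarrow \frac{d\psi}{d\tau} = \frac{dt}{d\tau}\frac{d\psi}{dt} = \frac{a\gamma^3}{c}.$$

But from question 2. we have $a\gamma^3 = a^+$, hence:

$$\frac{d\psi}{d\tau} = \frac{a^+}{c} \Rightarrow \psi = \frac{a^+\tau}{c}. \tag{7.42}$$

We infer the (important) result that the quantity $d\psi/d\tau$ is an invariant. Concerning the
expression of the four-acceleration in terms of the rapidity we have⁴:

$$ca^0 = \beta\gamma a^+ = a^+\sinh\psi = a^+\sinh\frac{a^+\tau}{c} \tag{7.43}$$

$$a^1 = \gamma a^+ = a^+\cosh\psi = a^+\cosh\frac{a^+\tau}{c} \tag{7.44}$$

(Notice the difference between $a_x$ and $a^+$).

⁴We consider $a^+$ = constant, however the solution remains the same if $a^+(\tau)$.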


4. Calculation of the four-velocity

From (7.40) we find:

$$u^0 = \frac{1}{\lambda}\,a^1 = \frac{c}{a^+}\,a^+\cosh\frac{a^+\tau}{c} = c\cosh\frac{a^+\tau}{c}, \qquad u^1 = \frac{1}{\lambda}\,ca^0 = \frac{c}{a^+}\,a^+\sinh\frac{a^+\tau}{c} = c\sinh\frac{a^+\tau}{c}. \tag{7.45}$$

Another way to calculate the components of the four-velocity is the following.
From the definition of the four-velocity and relations (7.43) and (7.44) we have:

$$u^0 = \int ca^0\,d\tau = \int a^+\sinh\frac{a^+\tau}{c}\,d\tau = c\cosh\frac{a^+\tau}{c} + A$$

where A is a constant. Similarly we compute $u^1 = c\sinh\dfrac{a^+\tau}{c} + B$ where B is a
constant. Using the initial conditions we compute A = B = 0 and the previous
result follows.


In order to calculate the motion of P in Σ in terms of the proper time τ we
consider the definition of $u^0$, $u^1$ and write:

$$\frac{dt}{d\tau} = \frac{u^0}{c} = \cosh\frac{a^+\tau}{c}, \qquad \frac{dx}{d\tau} = u^1 = c\sinh\frac{a^+\tau}{c}.$$

Integrating we find:

$$t = \frac{c}{a^+}\sinh\frac{a^+\tau}{c} + A_1 \tag{7.46}$$

$$x = \frac{c^2}{a^+}\cosh\frac{a^+\tau}{c} + B_1, \tag{7.47}$$

where $A_1$, $B_1$ are constants. Assuming the initial condition t(0) = 0, x(0) = x₀
we compute $A_1 = 0$, $B_1 = -\dfrac{c^2}{a^+} + x_0$. Finally:

$$t = \frac{c}{a^+}\sinh\frac{a^+\tau}{c}, \qquad x - x_0 = \frac{c^2}{a^+}\left(\cosh\frac{a^+\tau}{c} - 1\right). \tag{7.48}$$


In order to calculate the motion of P in Σ in terms of the time t in Σ we use
the transformation of the time component:

$$d\tau = \frac{dt}{\gamma} = \frac{dt}{\cosh\frac{a^+\tau}{c}} = \frac{dt}{\sqrt{1 + \sinh^2\frac{a^+\tau}{c}}} = \frac{dt}{\sqrt{1 + \left(\frac{a^+t}{c}\right)^2}}. \tag{7.49}$$

Similarly for the position we have from (7.48):

$$x - x_0 = \frac{c^2}{a^+}\left[\sqrt{1 + \left(\frac{a^+t}{c}\right)^2} - 1\right]. \tag{7.50}$$


7.4.1 Geometric Representation of Hyperbolic Motion 


It is useful to discuss the geometric representation of hyperbolic motion in the
Euclidean plane (ct, x). From (7.48) we have for the worldline of P:

$$\left(x - x_0 + \frac{c^2}{a^+}\right)^2 - c^2t^2 = q^2 \tag{7.51}$$

where $q = c^2/a^+$. Equation (7.51) is the equation of the worldline of the ReMaP P.
The graphic representation of (7.51) on the Euclidean plane x - ct is a hyperbola
(hence the name of this type of motion) with asymptotes $x - x_0 + c^2/a^+ = \pm ct$. The
asymptotes are null spacetime curves which can be considered as the worldlines of
photons emitted from the point $x = -c^2/a^+ + x_0$ at the moment t = 0 of Σ. The
motion is a "rotation" (along a hyperbolic circle) of the line connecting the point P
with the pivot point $x = -c^2/a^+ + x_0$. The "radius" of this rotation is $q = c^2/a^+$. The
above are represented in Fig. 7.1. We note that the label q on the vector in Fig. 7.1
is measuring the Lorentz or hyperbolic length which coincides with the Euclidean
length only when t = 0.
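
The relations (7.48), (7.50) and (7.51) can be evaluated with a few lines of Python. The sketch below is only an illustration: the function names, the units with c = 1 and the sample values of $a^+$, $x_0$ and τ are assumptions.

```python
import math

def hyperbolic_motion(tau, a_plus, x0=0.0, c=1.0):
    """Time, position, beta and gamma of P in Sigma at proper time tau, from (7.48)."""
    psi = a_plus * tau / c                      # rapidity (7.42)
    t = (c / a_plus) * math.sinh(psi)
    x = x0 + (c**2 / a_plus) * (math.cosh(psi) - 1.0)
    return t, x, math.tanh(psi), math.cosh(psi)

def position_at_time(t, a_plus, x0=0.0, c=1.0):
    """Position of P in Sigma as a function of the time t of Sigma, from (7.50)."""
    return x0 + (c**2 / a_plus) * (math.sqrt(1.0 + (a_plus * t / c) ** 2) - 1.0)

a_plus, x0, c = 2.0, 1.5, 1.0
q = c**2 / a_plus                               # "radius" of the hyperbolic rotation
t, x, beta, gam = hyperbolic_motion(0.7, a_plus, x0, c)

# Left-hand side of the worldline equation (7.51); it should equal q^2.
worldline_lhs = (x - x0 + q) ** 2 - (c * t) ** 2

print(t, x, beta, gam)
print(worldline_lhs, q**2)
print(position_at_time(t, a_plus, x0, c))
```

For any proper time the event (ct, x) stays on the hyperbola of "radius" q around the pivot point $x_0 - c^2/a^+$, and the position computed from the coordinate time agrees with the one computed from the proper time.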

Exercise 7.4.1 Based on the above kinematic interpretation of the asymptotes
$x + c^2/a^+ = \pm ct$ show that if a ReMaP P which rests at the origin x = 0 of Σ departs
at the moment t = 0 in Σ along the x-axis with constant proper acceleration $a^+$
while at the same moment (t = 0) of Σ a photon is emitted from the point $-c^2/a^+$
of the x-axis towards P, the photon will never reach P.

The geometric representation of the worldline becomes more explanatory if we
introduce the new variable $X = x - x_0 + c^2/a^+$. Then the equation of the worldline
reads (see Fig. 7.2):

$$X^2 - c^2t^2 = q^2. \tag{7.52}$$


Fig. 7.1 Geometric representation of hyperbolic motion

Fig. 7.2 Geometric representation of hyperbolic motion in coordinates (X, ct)

In the new coordinates the pivot point is located at the value $X_0 = 0$ and the
parameter q is the radius of hyperbolic rotation (see Fig. 7.2).
If we represent the worldline in the complex plane (X, ict) we obtain a circle of
radius q (Fig. 7.3).


Exercise 7.4.2 Show that in the complex plane the asymptotes X = ±ct pass
through the origin O and the worldline becomes a circle of radius q. If φ(A, B)
is the angle between two rays OA, OB in the complex plane, then:

$$\mathrm{arc}(AB) = q\,\phi(A, B). \tag{7.53}$$

The angle φ(A, B) equals the rapidity between the LRIO with worldlines tangent to
the events A, B.

Fig. 7.3 Geometric representation of hyperbolic motion in the complex plane


Exercise 7.4.3 Consider the expressions (ψ ∈ R, q = constant):

$$X = q\cosh\psi \tag{7.54}$$

$$ct = q\sinh\psi \tag{7.55}$$

and show that they define a parametric expression of the worldline $X^2 - c^2t^2 = q^2$
with parameter ψ. Also show that the Lorentz length ds along the worldline is given
by the expression:

$$ds = q\,d\psi \tag{7.56}$$

and it is reduced to the equation (7.53).


Exercise 7.4.4 Show that the parameter ψ of Exercise 7.4.3 satisfies the relations:

$$\psi = \tanh^{-1}\beta \tag{7.57}$$

$$\gamma = \cosh\psi \tag{7.58}$$

$$\psi = \frac{a^+\tau}{c} \tag{7.59}$$

where τ is the proper time of P and the initial condition is τ = 0 when ψ = 0.


The hyperbolic motion is a covariant motion, in the sense that if a ReMaP P
moves in a LCF with hyperbolic motion then its motion in any other LCF is also
hyperbolic. Indeed we observe that for hyperbolic motion in Σ:

$$a^i = \begin{pmatrix} a^+\sinh\frac{a^+\tau}{c} \\ a^+\cosh\frac{a^+\tau}{c} \end{pmatrix} = \frac{a^+}{q}\begin{pmatrix} q\sinh\psi \\ q\cosh\psi \end{pmatrix} = \left(\frac{a^+}{c}\right)^2 X^i \tag{7.61}$$

where $X^i$ is the position four-vector in the plane (ict, X) (see Fig. 7.3). Furthermore
from the orthogonality condition $a^i u_i = 0$ we find:

$$u^i X_i = 0. \tag{7.62}$$

Relations (7.61) and (7.62) constitute the complete covariant characterization
of hyperbolic motion. They may be considered as defining the uniform rotational
motion in Special Relativity, because equation (7.62) shows that the position
four-vector (the "radius" of rotation of the cyclic motion) is normal to the tangent of the
spacetime orbit, which is the four-velocity $u^i$.


7.5 Synchronization 


The concept of synchronization is a key concept in the understanding of Special
Relativity. However it appears that there does not exist a clear exposition of it in the
literature. In the case of RIO the synchronization of the clocks is done by means of
light signals in the well known way (Einstein synchronization) but little is said about
the synchronization of accelerated relativistic observers. In the subsequent sections
we shall deal with this concept, always within the limits set by the level of this book.

In Special (and General) Relativity there are two “times”, the coordinate time and 
the proper time. The first corresponds to the zeroth coordinate of a specific event 
measured by a relativistic observer (inertial or not!) by the method of chronometry. 
The second is the indication of the personal clock read (no measurement procedure 
is used!) by the same observer. The two concepts are also different mathematically. 
The first is a coordinate and the second is a Lorentz invariant. The Lorentz 
transformation relates the coordinate time, not the proper time. 

Suppose we have a set of observers describing the various events in spacetime. 
Because in Special Relativity there is no universal (i.e. absolute) time, they are 
not able to relate their kinematical observations; they can only exchange their 
measurements by the appropriate Lorentz transformation — and that only in the 
case they are RIO — and the Lorentz transformation by itself does not produce 
information. For an intrinsic description of kinematics we must define a corre- 
spondence between the proper clocks of the relativistic observers (inertial or not!). 
Every such correspondence we call a synchronization. For a synchronization to be 
“satisfactory” we demand that it complies with the following requirements: 


• It must be independent of the coordinate system employed by the observers.

• In case the observers are RIO it must be symmetric (i.e. observer independent in
order to preserve the equivalence of the observers) and Lorentz covariant.

• It must define a 1:1 correspondence between the world lines of the observers (that
is, to each proper moment of one observer there must correspond one proper moment
of the other and vice versa).


Fig. 7.4 Einstein synchronization of RIO Σ, Σ'


In Newtonian Physics the synchronization is unique (i.e. absolute) and it is 
defined by the identification of the “proper time” of all Newtonian observers (inertial 
or not) with the absolute clock. This is the reason we do not consider explicitly the 
concept of synchronization in Newtonian Physics. 

It is evident that there are infinitely many ways to define a synchronization, the most
important and physically significant being those which are closest to
the concept of Newtonian time. In the following we consider first the standard
synchronization between two RIOs (Einstein synchronization) and subsequently a
synchronization between a RIO and an accelerated observer.


7.5.1 Einstein Synchronization 


It is natural to expect that the most useful and natural synchronization between RIO
must be defined in terms of light signals. This synchronization is the one considered
initially by Einstein and for this reason it is known as Einstein synchronization.
This synchronization is defined as follows (Fig. 7.4). Consider the worldlines of two
RIO Σ, Σ', which are straight lines. At the event 1 along the worldline of the RIO Σ
consider the light cone with apex at that event. This cone is unique and furthermore
intersects the worldline of the RIO Σ' at the point 1' say. This procedure can be
done at every point along the world line of Σ and it is easy to show that it defines a
synchronization (that is a 1:1 map) between the world lines of the RIO Σ, Σ'. If we
consider a number of equidistant points 1, 2, 3, ... along the worldline of Σ then
with the Einstein synchronization the corresponding points 1', 2', 3', ... along the
worldline of Σ' are also equidistant due to the constancy of the velocity of light and
the constancy of the relative velocity of the observers. This implies that if τ, τ' is
the proper time of Σ, Σ' respectively then the Einstein synchronization is expressed
analytically with the relation⁵:

$$\tau_{\Sigma'} = k(v_{\Sigma'\Sigma})\,\tau_\Sigma \tag{7.63}$$

where $k(v_{\Sigma'\Sigma})$ is a function of the relative velocity of Σ, Σ'. Since Σ, Σ' must be
kinematically equivalent we require

$$k(v_{\Sigma'\Sigma}) = k(v_{\Sigma\Sigma'}) = k(v) \tag{7.64}$$


where v is the relative speed of Σ, Σ'. In order to define the function k(v) we require
that k(v) is compatible with the chronometric measurement of the coordinates of
events in spacetime discussed in Sect. 5.4.

We consider two light signals sent from Σ to Σ' at the events P, Q whose proper
time interval is $\Delta\tau_{PQ}$. The second signal is received by Σ' at the event
Q' where the light cone with apex at Q meets the world line of Σ'. The proper time
difference of PQ' according to (7.63) is $\Delta\tau_{PQ'} = k(v)\Delta\tau_{PQ}$. In order to compute
the speed of RIO Σ' wrt the RIO Σ we consider that Σ' sends from the event Q'
a light signal which reaches the world line of RIO Σ at the point $Q_1$. The proper
time difference of the events P, $Q_1$ is (see Fig. 7.5):

$$\Delta\tau_{PQ_1} = k(v)\,\Delta\tau_{PQ'} = k^2(v)\,\Delta\tau_{PQ}. \tag{7.65}$$


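According to chronometry the spacetime coordinates of the event Q' for Σ are
$(ct_{Q'}(\Sigma), x_{Q'}(\Sigma))$ where⁶

$$t_{Q'}(\Sigma) = \frac{1}{2}\left(\Delta\tau_{PQ_1} - \Delta\tau_{PQ}\right) + \Delta\tau_{PQ} = \frac{1}{2}\left(k^2(v) + 1\right)\Delta\tau_{PQ}$$

$$x_{Q'}(\Sigma) = \frac{1}{2}\left(\Delta\tau_{PQ_1} - \Delta\tau_{PQ}\right)c = \frac{1}{2}\left(k^2(v) - 1\right)c\,\Delta\tau_{PQ}. \tag{7.66}$$

Therefore the (constant) speed of Σ' wrt Σ is

$$v = \frac{\Delta x_{Q'}(\Sigma)}{\Delta t_{Q'}(\Sigma)} = \frac{k^2(v) - 1}{k^2(v) + 1}\,c. \tag{7.67}$$

This relation provides the link between the speed v and the factor k(v). Inverting
this relation we find

$$k(v) = \sqrt{\frac{1 + \beta}{1 - \beta}}, \qquad \beta = \frac{v}{c}. \tag{7.68}$$

⁵The coordinate times of events are related by the Lorentz transformation, i.e. $ct' = \gamma(ct - \mathbf{r}\cdot\boldsymbol{\beta})$.
⁶We assume that P is the origin of the coordinates.

Fig. 7.5 Chronometric coordination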


This expression coincides with a special form of the Doppler effect (to be discussed 
in subsequent chapters ...) under certain kinematic arrangements. This coincidence 
has led to a misunderstanding concerning the role and the significance of the k(v) 
factor. Indeed the Doppler factor concerns the frequency (the zero component of the 
four frequency vector) and involves one light beam and two observers. However the 
role of the k(v) factor is to define in a coordinate free way a 1:1 relation between 
the proper times of two RIO or, equivalently, a map of the world lines of two RIOs 
using beams of light. In this sense the Doppler expression is coordinate dependent 
whereas the k(v) approach is covariant and coordinate independent. 


7.5.2 K — Calculus 


It is possible to use the k(v) factor and develop many of the concepts of Special
Relativity in a coordinate free manner.⁷ It is emphasized that the k-calculus is
limited to general arguments and simple arrangements of purely kinematic nature
but it is useful because it introduces the concepts independently of the coordinates,
that is the "time" and the "space". Let us see briefly some applications of this
approach.

We start with the derivation of the Lorentz transformation. Consider the event R
say whose (chronometric!) coordinates for Σ are $(ct, x)_\Sigma$. Then it is easy to see⁸
that the chronometric determination of the coordinates requires that the light signals
are emitted towards the event R at the moment $\tau_{emis} = ct - x$ and received back at the
moment $\tau_{recept} = ct + x$ (see Fig. 7.6).

⁷This approach to Special Relativity has been developed mainly by the English school of relativists
and specifically by H. Bondi. See Brandeis Summer Institute in Theoretical Physics (1964) Volume
1, Lectures on General Relativity p.375. For more recent information see also
http://en.wikipedia.org/wiki/Bondi_k-calculus.

⁸Indeed the spatial coordinate of R is measured in the middle of the distance between the points
$\tau_{recept}$ and $\tau_{emis}$ which is $\frac{1}{2}(ct + x - (ct - x)) = x$ and this is at the time moment
$\frac{1}{2}(ct + x - (ct - x)) + (ct - x) = ct$.

Fig. 7.6 k-Calculus

Consider now the second RIO Σ' whose world line intersects with the light
signals used for the chronometry of Σ. Let $(ct', x')_{\Sigma'}$ be the coordinates of the
point R for Σ'. Then with the same argument the points of intersection will be at
the proper times⁹ $\tau_{1,\Sigma'} = ct' - x'$ and $\tau_{2,\Sigma'} = ct' + x'$. Then using the k-calculus
from Fig. 7.6 we have for the proper lengths OR' and OQ':

$$ct' - x' = k(v)\,(ct - x), \qquad ct + x = k(v)\,(ct' + x') \tag{7.69}$$

where the events P, Q' are considered as emission and the events P', Q are
reception.
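
The content of (7.69) can be illustrated numerically. The Python sketch below solves the two relations of (7.69) for (ct', x') and compares the result with the familiar boost along the x-axis. It assumes the form $k = \sqrt{(1+\beta)/(1-\beta)}$, which follows from inverting (7.67); the function names and the sample event are assumptions for the illustration.

```python
import math

def k_factor(beta):
    """k-factor obtained by inverting (7.67): k^2 = (1 + beta) / (1 - beta)."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))

def boost_from_k(ct, x, beta):
    """Solve (7.69) for (ct', x'): ct' - x' = k (ct - x) and ct' + x' = (ct + x) / k."""
    k = k_factor(beta)
    minus = k * (ct - x)        # ct' - x'
    plus = (ct + x) / k         # ct' + x'
    return 0.5 * (plus + minus), 0.5 * (plus - minus)

def boost_standard(ct, x, beta):
    """Standard boost along the x-axis, for comparison."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return g * (ct - beta * x), g * (x - beta * ct)

ct, x, beta = 2.0, 0.5, 0.6
ct_k, x_k = boost_from_k(ct, x, beta)
ct_s, x_s = boost_standard(ct, x, beta)

print(ct_k, x_k)
print(ct_s, x_s)
print(ct_k**2 - x_k**2, ct**2 - x**2)   # Lorentz interval in both frames
```

Both routes give the same primed coordinates, and the interval $c^2t^2 - x^2$ is unchanged because the two factors k and 1/k cancel in the product $(ct' - x')(ct' + x')$.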


Exercise 7.5.1 Replace in these relations k(v) from (7.68) and derive the boost
along the x-axis. Then show the invariance of the Lorentz interval $c^2t'^2 - x'^2 = c^2t^2 - x^2$.

Next we consider the relativistic composition of velocities. For this we consider
three RIOs Σ, Σ', Σ'' who are related with the k-factors $k_{\Sigma,\Sigma'}$, $k_{\Sigma',\Sigma''}$, $k_{\Sigma,\Sigma''}$.
Then Σ sends two light signals with proper time difference τ which are received by
both Σ', Σ''. According to k-calculus the proper time difference for observer Σ'
is $k_{\Sigma,\Sigma'}\tau$ and for observer Σ'' is $k_{\Sigma',\Sigma''}k_{\Sigma,\Sigma'}\tau$ if considered as emitted by Σ' and
$k_{\Sigma,\Sigma''}\tau$ if considered as emitted by Σ. Obviously one arrives at the relation

$$k_{\Sigma,\Sigma''} = k_{\Sigma,\Sigma'}\,k_{\Sigma',\Sigma''} \tag{7.70}$$

which expresses the relativistic rule of composition of 3-velocities in the language
of k-calculus. In order to derive the well known formula (see (6.13)) we use (7.67)
and have

$$\beta_{\Sigma,\Sigma''} = \frac{k^2_{\Sigma,\Sigma''} - 1}{k^2_{\Sigma,\Sigma''} + 1} = \frac{k^2_{\Sigma,\Sigma'}k^2_{\Sigma',\Sigma''} - 1}{k^2_{\Sigma,\Sigma'}k^2_{\Sigma',\Sigma''} + 1} = \frac{\frac{1+\beta_{\Sigma,\Sigma'}}{1-\beta_{\Sigma,\Sigma'}}\,\frac{1+\beta_{\Sigma',\Sigma''}}{1-\beta_{\Sigma',\Sigma''}} - 1}{\frac{1+\beta_{\Sigma,\Sigma'}}{1-\beta_{\Sigma,\Sigma'}}\,\frac{1+\beta_{\Sigma',\Sigma''}}{1-\beta_{\Sigma',\Sigma''}} + 1} = \frac{\beta_{\Sigma,\Sigma'} + \beta_{\Sigma',\Sigma''}}{1 + \beta_{\Sigma,\Sigma'}\,\beta_{\Sigma',\Sigma''}}. \tag{7.71}$$

⁹These times correspond to chronometric measurement of the coordinates of R by Σ' since again
we have a light signal sent by Σ' and received by Σ' from the event R. In other words we have a
common chronometric measurement of the coordinates of the event R by Σ, Σ'.

Exercise 7.5.2

a. Prove that (7.71) coincides with relation (6.13).

b. Using that the velocity factor β in terms of the rapidity φ is given by the relation
β = tanh φ prove that $k = e^{\phi}$. Then show that relation (7.70) in terms of the
rapidities becomes $\phi_{\Sigma,\Sigma''} = \phi_{\Sigma,\Sigma'} + \phi_{\Sigma',\Sigma''}$ (compare with Example 6 Sect. 1.9).
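
The composition rule (7.70) is easy to try out numerically. The following Python sketch multiplies two k-factors, converts the product back to a velocity factor with (7.67), and compares with the usual composition formula and with the additivity of rapidities. The function names and the sample values are assumptions for the illustration, and the k-factor is taken in the form obtained by inverting (7.67).

```python
import math

def k_factor(beta):
    """k-factor for velocity factor beta, from inverting (7.67)."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))

def beta_from_k(k):
    """Velocity factor for a given k-factor, relation (7.67) with beta = v/c."""
    return (k * k - 1.0) / (k * k + 1.0)

def compose(beta_1, beta_2):
    """Relativistic composition of collinear velocities via (7.70)."""
    return beta_from_k(k_factor(beta_1) * k_factor(beta_2))

beta_1, beta_2 = 0.5, 0.5
beta_total = compose(beta_1, beta_2)
beta_formula = (beta_1 + beta_2) / (1.0 + beta_1 * beta_2)

# Rapidity is the logarithm of the k-factor, so rapidities add.
phi_total = math.log(k_factor(beta_total))
phi_sum = math.atanh(beta_1) + math.atanh(beta_2)

print(beta_total, beta_formula)
print(phi_total, phi_sum)
```

Two successive boosts with β = 0.5 compose to β = 0.8, not 1.0, and the corresponding rapidities simply add.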


7.6 Rigid Motion of Many Relativistic Mass Points 


To see the necessity of rigid motion in Special Relativity let us start with the 
following plausible situation. Suppose a spaceship moves with high speed in space 
in a long journey. We expect: 


a. That during its motion the spaceship will accelerate and decelerate as required
while traveling

b. The spaceship will not tear apart during the course of the motion and in fact 
neither the astronauts nor the various instruments will change shape or become 
shorter or longer, i.e. in general, the spaceship and all its contents will be “rigid” 
in the Newtonian sense. 


Obviously this contradicts Special Relativity due to length contraction. On the
other hand it is a necessity, otherwise there is no point for us (the Newtonians!)
to develop space traveling! The first to realize this situation was Max Born, a
German physicist, who shortly after the introduction of Special Relativity¹⁰ (i.e. in
1910) asked if the untenable concept of "rigid body" can be generalized in Special
Relativity to the concept of "rigid motion", that is a relativistic motion in which the
spatial distances of a cluster of ReMaPs remain constant during the motion. It turned
out that such motions are possible (under special conditions, not in general!) and
have been called Born rigid motions. In the following we discuss Born rigidity for
various types of motion. We start with the simple hyperbolic motion, continue with
an arbitrary one dimensional motion, and end up with the rotational motion.


¹⁰Born, Max (1910), "Zur Kinematik des starren Körpers im System des Relativitätsprinzips"
[Wikisource translation: On the Kinematics of the Rigid Body in the System of the Principle of
Relativity], Göttinger Nachrichten, 2: 161-179.


However this is not the end of the story. Indeed, how can one keep time in a
spaceship? Due to the time dilatation effect two clocks set at identical readings at
different points along the spaceship will give different indications as the motion
occurs. Which is correct and which is false? Which clock should the astronauts
believe? The situation can be dealt with by an "equivalence" of clocks
and this is what the synchronization is all about. We will define a number of
synchronization procedures showing that this concept is conventional and indeed
it is the transfer of the concept of absolute Newtonian time in Special Relativity.


7.7 Rigid Motion and Hyperbolic Motion 


Consider two ReMaPs 1, 2 which depart from rest at positions $x_{01}$, $x_{02}$ and move
along the x-axis with constant proper accelerations $a_1^+$ and $a_2^+$ respectively.
According to the results of Sect. 7.4 the world lines of the ReMaPs are given by
the equations:

$$\left(x - x_{0i} + \frac{c^2}{a_i^+}\right)^2 - c^2t^2 = q_i^2, \qquad i = 1, 2$$

where (ct, x) are coordinates in Σ and $q_i = c^2/a_i^+$. Figure 7.7 shows the worldlines of
the particles in the Euclidean plane (ct, x). The world lines are hyperbolae which are
asymptotic to the light lines through the pivot event $x_{0i} - c^2/a_i^+$; $q_i$ is the i-th particle's
proper distance from the pivot point (with respect to its instantaneous rest frame)
which is constant.

Fig. 7.7 General non-rigid motion

We consider a point $A_1$ along the worldline of ReMaP 1 and extend $O_1A_1$ until
it intersects the worldline of ReMaP 2 at the point $A_1'$. The length $A_1A_1'$ is the
distance of the two ReMaPs as measured by ReMaP 1. If we draw the horizontal line
from $A_1'$ to the axis ct we define the point $A_1''$ of the worldline 1. The length $A_1''A_1'$
is the distance of the two ReMaPs at the moment $ct_1$ of Σ. As ReMaP 1 "moves"
along its worldline the distance between the two ReMaPs changes and eventually
becomes infinite when the projection of the "radius" of ReMaP 1 does not intersect
the worldline of ReMaP 2. E.g. $B_1B_1' > A_1A_1'$. Note that the line $O_1A_1$ is the
x-axis of the ReMaP 1.

The crucial point is that all hyperbolic paths asymptotic to the same light lines
share the same pivot event, P say, hence maintain a constant proper distance from
that event and, therefore, from each other. Hence, in order for two particles, 1, 2 say,
which accelerate with constant proper accelerations $a_1^+$, $a_2^+$ respectively along the
x-axis to maintain constant proper distances from each other, the magnitude of the
constant proper acceleration of each particle must be inversely proportional to its
(signed) distance from the common pivot event. In other words, if $x_{0i}$, i = 1, 2 is
the location of the i-th particle at time t = 0 of Σ, then this particle's constant proper
acceleration must be

$$a_i^+ = \frac{c^2}{x_{0i} - x_P}, \qquad i = 1, 2 \tag{7.72}$$

where $x_P$ is the location of the pivot point on the x-axis. Notice that the magnitude
of the acceleration goes to infinity at the pivot event, and also changes sign, so
particles on opposite sides of the pivot event are accelerating in opposite directions.
Choosing our initial coordinates such that the pivot event is the origin, the worldlines
of several uniformly spaced particles undergoing this kind of "Born acceleration"
are shown in Fig. 7.8.
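
The rigidity condition (7.72) can be illustrated with a short Python sketch. Two particles start at rest at $x_{01}$, $x_{02}$ on the same side of a common pivot point and accelerate according to (7.72). Events on the two worldlines with the same rapidity ψ lie on a line through the pivot event, and the sketch checks that their separation is orthogonal to the common four-velocity (so the events are simultaneous in the instantaneous rest frame) and has constant Lorentz length. The function names, the units with c = 1 and the sample positions are assumptions for the illustration.

```python
import math

def proper_acceleration(x0, x_pivot=0.0, c=1.0):
    """Proper acceleration required by (7.72) for a particle starting at x0."""
    return c**2 / (x0 - x_pivot)

def event_at_rapidity(x0, psi, x_pivot=0.0):
    """Event (ct, x) on the hyperbolic worldline through x0 at rapidity psi."""
    q = x0 - x_pivot                      # proper distance from the pivot event
    return q * math.sinh(psi), x_pivot + q * math.cosh(psi)

def separation(x01, x02, psi, x_pivot=0.0):
    """Lorentz length of the separation of the two events at rapidity psi, and its
    Minkowski product with the common four-velocity direction (cosh psi, sinh psi)."""
    ct1, x1 = event_at_rapidity(x01, psi, x_pivot)
    ct2, x2 = event_at_rapidity(x02, psi, x_pivot)
    d_ct, d_x = ct2 - ct1, x2 - x1
    length = math.sqrt(d_x**2 - d_ct**2)
    product = -d_ct * math.cosh(psi) + d_x * math.sinh(psi)
    return length, product

x01, x02 = 1.0, 2.0                       # assumed initial positions, pivot at the origin
results = [separation(x01, x02, psi) for psi in (0.0, 0.5, 1.0, 2.0)]

print(proper_acceleration(x01), proper_acceleration(x02))
for length, product in results:
    print(length, product)
```

The particle closer to the pivot needs the larger proper acceleration, and the proper distance between the two stays equal to the initial separation $x_{02} - x_{01}$ at every rapidity.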


Fig. 7.8 Born acceleration


Assume now that the two ReMaPs 1, 2 are the end points of a "rod". Can this
rod be rigid? The answer is "no" because the rigid rod (in the standard Newtonian
sense) assumes invariance of its length under the Galileo group.¹¹ The next question
is: Can this rod move in such a way so that it will "appear" to be rigid to the ReMaP
1 (and of course 2)? The answer is "yes" provided that the world lines of 1, 2 share
the same pivot point and the value of their proper accelerations is given by (7.72).

To connect the result with the previous considerations we conclude that the
astronauts in the spaceship will have the impression that the spaceship is like a
rigid Newtonian structure provided the proper accelerations of any two points 1, 2
of the ship satisfy the condition

$$\frac{c^2}{a_2^+} - \frac{c^2}{a_1^+} = x_{02} - x_{01} \tag{7.73}$$

where we have assumed that the pivot point is the origin of the coordinates. Because
the speed of a particle with hyperbolic motion is $u = c\tanh\dfrac{a^+\tau}{c}$ (see (7.60)) we see
that the speed of the particles of the spaceship for Σ varies from point to point across
the spaceship.¹²

This type of rigid motion in Special Relativity is called Born rigid
motion. The distance AA' we call the proper length of the (relativistic) rod
1, 2. Concerning the motion of the points of the rod we say that the rod is Born
accelerated along the x-axis of the RIO Σ. Note that in Σ the spatial dimension
of the spacecraft changes!

In order to see the Born rigid motion geometrically we consider the ReMaPs 1, 2 
to share the same pivot point P, defined by the condition 
$x_{0,1} - \frac{c^{2}}{a_1^{+}} = x_{0,2} - \frac{c^{2}}{a_2^{+}}$ (see Fig. 7.9). 

In this case the two hyperbolae are “parallel” and 
$x_{0,2} - x_{0,1} = \frac{c^{2}}{a_2^{+}} - \frac{c^{2}}{a_1^{+}} = q_2 - q_1$. 
It follows that the spatial distance AA′ of the ReMaPs 1, 2 as seen in 
the proper frame of either 1 or 2 is constant and equals the initial length $x_{0,2} - x_{0,1}$. 

The next step is to define a synchronization of the proper times among the points 
of the rod. This synchronization will define the “proper” time of the rod and will 
correlate the kinematics of the various parts of the rod. It is possible to define two 
different types of synchronization for 1, 2. 


¹¹In a sense we may say that Newtonian rigidity is treated like Newtonian time, hence it needs 
a kind of “synchronization” in order to be used in Special Relativity. That is, just as different 
observers disagree about the time coordinate of an event, different observers could disagree about 
the length of a Newtonian rigid rod. 


¹²These results hold for one dimensional motion, hence they are ideal! 


7.7 Rigid Motion and Hyperbolic Motion 219 


Fig. 7.9 Born rigid motion 


7.7.1 Born Synchronization of LRIO 


Let $A_1$ be an arbitrary event along the worldline of ReMaP 1 at the proper moment 
$\tau_1$ and let $A_1'$ be the corresponding event on the worldline of 2 (see Fig. 7.9), which 
is defined by the extension of $OA_1$. From (7.60) we have: 

\[ \psi_{A_1} = \frac{a_1^{+}}{c}\,\tau_1 = \frac{a_2^{+}}{c}\,\tau_2. \tag{7.74} \]

Relation (7.74) defines a diffeomorphism between the worldlines of 1, 2, hence 
a synchronization of the proper times of 1, 2. This synchronization we call the 
synchronization of the LRIO. The name is due to the fact that the line $OA_1A_1'$ 
defines the x′-axis of the LRIO of the events $A_1$, $A_1'$. In terms of proper times the 
synchronization is expressed by the relation: 

\[ \tau_2 = \frac{a_1^{+}}{a_2^{+}}\,\tau_1 \tag{7.75} \]

and, in terms of proper time intervals: 

\[ \delta\tau_2 = \frac{a_1^{+}}{a_2^{+}}\,\delta\tau_1. \tag{7.76} \]

Obviously $\delta\tau_2 > \delta\tau_1$ assuming that $a_2^{+} < a_1^{+}$. Relation (7.76) means that the 
rate of the proper clock of 1 is slower than the rate of the proper clock of 2, the 
difference being measured by the quotient $a_1^{+}/a_2^{+}$ of the proper accelerations of 1, 2. 
This difference of proper clocks we call acceleration time dilatation. It is apparent 
that this dilatation has nothing to do with the Lorentz time dilatation, which is based 
on the Einstein synchronization and assumes inertial motion of the clocks 1, 2. 
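
The synchronization of the LRIO can be illustrated numerically. The following is only a sketch: it assumes that both worldlines share a pivot at the origin and uses the standard parametrisation $ct = q\sinh\psi$, $x = q\cosh\psi$ with $q = c^{2}/a^{+}$ and $\psi = a^{+}\tau/c$. The function names and the sample accelerations are hypothetical. It matches a proper time $\tau_1$ on clock 1 with $\tau_2$ on clock 2 through the ratio of proper accelerations and checks that the two matched events lie on the same straight line through the origin.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lrio_synchronized_proper_time(a1, a2, tau1):
    """Proper time of clock 2 matched to tau1 of clock 1: tau2 = (a1/a2) * tau1."""
    return (a1 / a2) * tau1


def event_on_worldline(a_plus, tau):
    """(ct, x) on a hyperbolic worldline whose pivot is the origin.

    Assumes the parametrisation ct = q sinh(psi), x = q cosh(psi),
    with q = c^2 / a_plus and psi = a_plus * tau / c.
    """
    q = C**2 / a_plus
    psi = a_plus * tau / C
    return q * math.sinh(psi), q * math.cosh(psi)


# Hypothetical sample values with a2 < a1, as assumed in the text.
a1, a2 = 20.0, 10.0          # proper accelerations in m/s^2
tau1 = 3.0e6                 # proper time on clock 1 in seconds
tau2 = lrio_synchronized_proper_time(a1, a2, tau1)

ct1, x1 = event_on_worldline(a1, tau1)
ct2, x2 = event_on_worldline(a2, tau2)

# Both events have the same rapidity, so ct/x is the same for both.
slope1 = ct1 / x1
slope2 = ct2 / x2
print(tau2 / tau1, slope1, slope2)
```

With these values clock 2 advances twice as fast as clock 1, although the two events are simultaneous for the LRIO.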




In order to express the acceleration time dilatation geometrically we write 
condition (7.76) in terms of spacetime angles and arcs. From (7.60) and (7.76) we 
have: 

\[ \delta\psi_{A'B'} = \delta\psi_{AB} \tag{7.77} \]

hence: 
—+ + x02 
SAB ay 
] = : (7.78) 
A'B [oe + x01 
1 


The synchronization of the LRIO is not the only one possible, but it has physical 
significance, because it is related directly to the observation of the spatial length 
of the rod by the LRIO. We note that even if the ReMaPs have a common proper 
acceleration $a^{+}$, the rates of time for the two ReMaPs are different due to the terms 
$x_{01}$, $x_{02}$. Hence it is not possible to have a “common” time for the two ReMaPs when 
the synchronization of the LRIO is used. The rates of time are also different in the 
case of Born rigid motion. 


7.7.2 Synchronization of Chronometry 


Consider two ReMaP 1, 2 which move with Born rigidity. Consider the events A, B 
along the worldline of the ReMaP | (see Fig. 7.10) and assume that two light signals 
are sent from the current location of ReMaP 1 towards the ReMaP 2 and reach the 
worldline of 2 at the points A’, B’ respectively. The lines AA’, BB’ are parallel to 
the surface of the light cones with vertex at the events A, B respectively. Obviously 
this procedure defines a diffeomorphism between the worldlines of 1, 2 therefore a 
synchronization which is expressed with the following relation: 


\[ \tau_{A'} = \tau_{A}. \tag{7.79} \]


This synchronization we call the synchronization of chronometry and coincides 
with the synchronization of the LRIO for this type of motion (see Fig. 7.10). 

Let us compare the arcs $s_{AB}$ and $s_{A'B'}$ along the worldlines of 1 and 2. These are 
the rates of the proper clocks of 1, 2. Assuming that the points A, B are “near” we 
write: 

\[ s_{A'B'} = q_2(\psi_B - \psi_A) \tag{7.80} \]

\[ s_{AB} = q_1(\psi_B - \psi_A) \tag{7.81} \]



Fig. 7.10 Synchronization of chronometry 


where $\psi_i$ is the rapidity of the event $i$ ($i = A, B, A', B'$). Using (7.60) and (7.74) 
we find: 
SA’ B! _ SAB 
ay (ty —Ta) af (te - TA) 


(7.82) 


that is, the quotient ake is invariant, justifying the name for this synchronization. 
We note the difference between the synchronization of the LRIO and that of 
chronometry. Which one we should follow is a matter of choice, convenience and, 
above all, physical reality. For example, if we design the dials of the proper clocks 
so that the hands of the clocks move along the hyperbolic worldlines and the 
rate of the clocks is the same, then if the indication of the clocks is the same at one 
moment it will stay the same at all future moments. 


7.7.3 The Kinematics in the LCF Σ 


The kinematics of a set of ReMaPs in Σ does not depend on the synchronization 
chosen to relate the proper clocks of the particles. Indeed in Σ observation is done 
chronometrically, therefore the quantities involved are only the coordinate ones. For 
example let us examine the velocities and the accelerations as observed in Σ of the 
ReMaPs 1, 2, which move hyperbolically with the same proper acceleration $a^{+}$. When 
the positions of 1, 2 are observed simultaneously in & the ReMaP | is at the event 
A‘, and the ReMaP 2 at the event A‘ (see Fig. 7.7). The events A;, A‘/ do not have 




the same rapidity, hence the observed velocities and accelerations of 1, 2 in Σ are 
different. More specifically we have: 


• Velocity of 1 as measured in Σ: $u_1 = c\tanh\psi_1$. 
• Acceleration of 1 as measured in Σ: $a_1 = a^{+}\cosh\psi_1$. 
• Velocity of 2 as measured in Σ: $u_2 = c\tanh\psi_2$. 
• Acceleration of 2 as measured in Σ: $a_2 = a^{+}\cosh\psi_2$. 


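Using the results of Example 7.4.1 we write these quantities in terms of the 
proper times of 1, 2: 

\[ u_1 = c\tanh\left(\frac{a^{+}}{c}\tau_1\right), \qquad a_1 = a^{+}\cosh\left(\frac{a^{+}}{c}\tau_1\right), \]
\[ u_2 = c\tanh\left(\frac{a^{+}}{c}\tau_2\right), \qquad a_2 = a^{+}\cosh\left(\frac{a^{+}}{c}\tau_2\right). \]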


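We infer that Σ measures the relative velocity: 

\[ u_{21} = u_2 - u_1 = c\left[\tanh\left(\frac{a^{+}}{c}\tau_2\right) - \tanh\left(\frac{a^{+}}{c}\tau_1\right)\right] \tag{7.83} \]

and the relative acceleration: 

\[ a_{21} = a_2 - a_1 = a^{+}\left[\cosh\left(\frac{a^{+}}{c}\tau_2\right) - \cosh\left(\frac{a^{+}}{c}\tau_1\right)\right]. \tag{7.84} \]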


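These equations can be written as follows: 

\[ u_{21} = c\tanh\left[\frac{a^{+}}{c}(\tau_2 - \tau_1)\right]\left[1 - \tanh\left(\frac{a^{+}}{c}\tau_2\right)\tanh\left(\frac{a^{+}}{c}\tau_1\right)\right] \tag{7.85} \]

\[ a_{21} = 2a^{+}\sinh\left[\frac{a^{+}}{2c}(\tau_2 - \tau_1)\right]\sinh\left[\frac{a^{+}}{2c}(\tau_2 + \tau_1)\right]. \tag{7.86} \]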
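
The passage from the differences of hyperbolic functions to their factored forms rests on the identities $\tanh x - \tanh y = \tanh(x - y)(1 - \tanh x\tanh y)$ and $\cosh x - \cosh y = 2\sinh\frac{x - y}{2}\sinh\frac{x + y}{2}$. The sketch below checks both numerically. The function names and the sample values of the proper acceleration and of the proper times are hypothetical choices for illustration only.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def relative_velocity(a_plus, tau1, tau2):
    """Return u2 - u1 computed directly and in factored form."""
    x1, x2 = a_plus * tau1 / C, a_plus * tau2 / C
    direct = C * (math.tanh(x2) - math.tanh(x1))
    factored = C * math.tanh(x2 - x1) * (1.0 - math.tanh(x2) * math.tanh(x1))
    return direct, factored


def relative_acceleration(a_plus, tau1, tau2):
    """Return a2 - a1 computed directly and in factored form."""
    x1, x2 = a_plus * tau1 / C, a_plus * tau2 / C
    direct = a_plus * (math.cosh(x2) - math.cosh(x1))
    factored = 2.0 * a_plus * math.sinh((x2 - x1) / 2.0) * math.sinh((x2 + x1) / 2.0)
    return direct, factored


# Hypothetical sample values: proper acceleration 9.81 m/s^2, proper times in seconds.
u_direct, u_factored = relative_velocity(9.81, 1.0e7, 2.0e7)
a_direct, a_factored = relative_acceleration(9.81, 1.0e7, 2.0e7)
print(u_direct, u_factored)
print(a_direct, a_factored)
```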


Example 7.7.1 A rod of length $l$ is resting along the x-axis of the LCF Σ. At 
the time moment $t = 0$ of Σ the rod (that is, all its points) starts to accelerate with 
constant proper acceleration. Assuming that the (Euclidean) proper length of the rod 
does not change during the motion (Born rigid body motion), calculate the length of 
the rod in Σ at the time moment $t$ (of Σ!). 

Solution 

We apply the boost relating the proper observer of the rod (LRIO!) $\Sigma^{+}$ with the 
RIO Σ to the four-vector defined by the end points A, B of the rod. In $\Sigma^{+}$ we 
have $(AB)^{+} = l^{+}$, $\delta t^{+} = t_B^{+} - t_A^{+} \neq 0$, and in Σ we have $(AB) = l$, 
$\delta t = t_B - t_A = 0$ because the ends A, B are observed simultaneously in Σ. The boost 
gives $l = l^{+}/\gamma$, where $\gamma = (1 - v^{2}/c^{2})^{-1/2}$ is a function of $t$ (in Σ!). Because 
$l^{+}$ is assumed to be constant we differentiate this relation and get 
$dl = -\frac{l^{+}}{\gamma^{2}}\,d\gamma$. Integrating in the region $[0, t]$ and making use of the 
initial conditions $v(0) = 0$ and $l(0) = l^{+}$ it follows: 

\[ l(t) = l^{+} + l^{+}\left(\frac{1}{\gamma(t)} - \frac{1}{\gamma(0)}\right) = \frac{l^{+}}{\gamma(t)}. \]

Fig. 7.11 Measurement of length of a Born accelerated rod 

We note that at every moment this implies the standard Lorentz contraction with 
a γ-factor depending on the time $t$ in Σ. Obviously as $t \to \infty$ we have $\gamma(t) \to \infty$, 
hence $l \to 0$ (see Fig. 7.11). 
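
The time dependence of the contracted length can be tabulated once a law for $\gamma(t)$ is fixed. The sketch below assumes the hyperbolic law $\gamma(t) = \sqrt{1 + (a^{+}t/c)^{2}}$ for the rod as a whole, which is an illustrative assumption and not part of the example's statement. The function names and the numerical values are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma_hyperbolic(a_plus, t):
    """gamma(t) in the frame of the RIO, assuming hyperbolic motion starting from rest."""
    return math.sqrt(1.0 + (a_plus * t / C) ** 2)


def rod_length(l_plus, a_plus, t):
    """Length l(t) = l_plus / gamma(t) observed at coordinate time t."""
    return l_plus / gamma_hyperbolic(a_plus, t)


l_plus = 100.0        # hypothetical proper length in metres
a_plus = 9.81         # hypothetical proper acceleration in m/s^2
one_year = 3.156e7    # seconds

# Observed length at t = 0, 1, 2, 3 years of coordinate time.
lengths = [rod_length(l_plus, a_plus, n * one_year) for n in range(4)]
for n, length in enumerate(lengths):
    print(n, length)
```

The list starts at the proper length and decreases monotonically, in line with $l \to 0$ as $t \to \infty$.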


Example 7.7.2 A rod is resting along the x′-axis of a RIO Σ′ when it starts moving 
(all its points!) with constant acceleration $a$ parallel to the y′-axis. At the same 
time (in Σ) the LCF Σ′ starts moving in the standard configuration wrt Σ along 
the common x, x′-axis with speed $v$. Find the equation of motion of the rod in Σ. 
Comment on the result. Examine the case that the rod is not accelerating, that is 
$a = 0$. 

Solution 

In Σ′ the equation of motion of an arbitrary point of the rod is: 

\[ y' = \tfrac{1}{2}at'^{2}, \qquad x' = \text{constant}, \qquad z' = 0. \]

The boost relating Σ′, Σ gives: 

\[ x = \gamma(v)(x' + \beta ct'), \qquad y = y' = \tfrac{1}{2}at'^{2}, \qquad z = z' = 0, \qquad ct' = \gamma(v)(ct - \beta x). \]

From the first of these relations we find: 

\[ ct' = \frac{1}{\beta}\left(-x' + \frac{x}{\gamma(v)}\right). \]


Using this, we eliminate $t'$ from the equation for $y$ and get: 

\[ y = \frac{a}{2\beta^{2}c^{2}}\left(\frac{x}{\gamma(v)} - x'\right)^{2} = \frac{a}{2\beta^{2}c^{2}\gamma^{2}(v)}\left(x - \gamma(v)x'\right)^{2}. \]


This equation shows that in the plane x − y of Σ the rod appears to be a parabola 
(see Fig. 7.12). 

This “distortion” of the shape of the rod in Σ is due to the fact that the rod is 
“seen” in Σ, that is, the events defined by the points of the rod are simultaneous in 
Σ′ and not in Σ. As a result the points that are further away appear “later” in Σ, 
hence the distortion of the shape of the rod in Σ. 

We consider now the case $a = 0$, that is, the velocity $u$ of the rod along the y′-axis 
is constant. Then we have: 

\[ y' = ut' \]

hence: 

\[ y = \frac{u}{v}\left(\frac{x}{\gamma(v)} - x'\right). \]

Fig. 7.12 The accelerated rigid rod along the y-axis (orbits of the end points A and B) 

Fig. 7.13 Rigid rod moving inertially along the y-axis (orbits of the end points A and B) 

The end point A has $x'(A) = 0$, therefore $y(A) = \frac{u}{v}\frac{x}{\gamma(v)}$. Similarly for the end point 
B of the rod we have $x'(B) = L$, hence $y(B) = \frac{u}{v}\left(\frac{x}{\gamma(v)} - L\right)$. In the plane x − y the 
curves $(x, y(A))$, $(x, y(B))$ are parallel straight lines with slope (see Fig. 7.13): 

\[ \tan\phi = \frac{dy}{dx} = \frac{u}{v\gamma(v)}. \]


We conclude that in Σ the rod remains parallel to the x-axis but moves at a fixed 
angle φ in the x − y plane, as shown in Fig. 7.13. Assuming that the end points A, B 
are simultaneous in Σ′ (so that the length of the rod is L) we have $t'(B) = t'(A)$, 
from which follows: 

\[ \frac{x(B) - x(A)}{\gamma(v)} = x'(B) - x'(A) = L \;\Rightarrow\; y(A) = y(B) \]

and the length of the rod in Σ is: 

\[ x(B) - x(A) = \gamma(v)L. \]

This result does not conflict with the Lorentz contraction, because in Σ we do 
not measure the length of the rod, that is, the spatial distance of the events A, B 
(because $t(A) \neq t(B)$). Hence $x(B) - x(A)$ is the coordinate length of the rod in 
Σ, not the chronometrically measured length. 
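
The distinction between the coordinate length and the measured length can be made concrete by boosting two end-point events that are simultaneous in Σ′. The sketch below uses the boost in the standard configuration with units in which distances and $ct$ share the same unit. The function name and the numerical values are hypothetical.

```python
import math


def boost_to_sigma(ct_prime, x_prime, beta):
    """Map an event (ct', x') of the moving frame to (ct, x) of the RIO (standard configuration)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    ct = gamma * (ct_prime + beta * x_prime)
    x = gamma * (x_prime + beta * ct_prime)
    return ct, x


beta = 0.6        # hypothetical speed of the moving frame in units of c
L = 2.0           # length of the rod in the moving frame
ct_prime = 5.0    # common time of the two end-point events in the moving frame

ct_A, x_A = boost_to_sigma(ct_prime, 0.0, beta)   # end point A, x' = 0
ct_B, x_B = boost_to_sigma(ct_prime, L, beta)     # end point B, x' = L

coordinate_length = x_B - x_A   # gamma * L, larger than L
time_offset = ct_B - ct_A       # gamma * beta * L, so the events are not simultaneous
print(coordinate_length, time_offset)
```

The nonzero time offset is the reason why the difference of the x-coordinates is not a length measurement in the RIO.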


7.7.4 The Case of the Gravitational Field 


In order to obtain a feeling for the kinematic results obtained in the previous section 
we consider a one dimensional rocket of proper length $l_0$, which in some RIO Σ 
starts from rest and moves rigidly. Due to the type of motion the observer inside the 
rocket will notice no change in the spatial distances inside the rocket. What can one 
say about the (proper) clocks within the rocket? 


If we consider the synchronization of chronometry, then clocks at each point 
inside the rocket which are set at the same indication, and whose hands move 
along the worldline of the clock (if the clocks are digital then their reading must 
change accordingly), will always show the same indication, because 
this synchronization is independent of the position within the rocket. If the same 
clocks are synchronized with the synchronization of the LRIO and start with the 
same reading, then according to (7.78) after a while they will have different readings 
depending on their position within the rocket. Obviously the synchronization of the 
LRIO could create confusion for the crew inside the rocket. 

Due to the rigidity of the motion there is a common LRIO for all points of the 
rocket. If this observer uses his clock to time the events inside the rocket then he will 
make use of the synchronization of the LRIO. This synchronization is closer to the 
Newtonian concept of absolute time and perhaps it will be the “natural” one to be 
used by a single traveler. However if there is more than one traveler in the rocket, 
and they set their clocks at the same reading and move about inside the rocket, after 
a while when they meet again they will find that their clocks have different readings! 
It becomes apparent how crucial the synchronization procedure is (the time keeping 
of more than one clock) and furthermore how conventional it is. 

Let us assume that the observer chooses the synchronization of the LRIO and 
examine what happens with the light signals inside the rocket. 

Consider two identical oscillators placed at a fixed distance $l$ within the rocket 
and let $\tau_1$, $\tau_2$ be their periods according to the observer in the rocket. Due to either 
synchronization the period of each oscillator depends on its position within the 
rocket. We stipulate that the number of complete oscillations of 1 and 2 is the same, 
because this has to do only with the internal function of the oscillators and they are 
assumed to be identical. Then according to the time keeping of the observer in the 
rocket, for $n$ periods oscillator 1 requires a duration $n\tau_1$ and oscillator 2 a duration 
$n\tau_2$. From (7.78) we have: 


(7.87) 


This frequency shift $\Delta\nu = \nu_1 - \nu_2$ is not due to the relative motion of the 
oscillators, but to the existence of the acceleration and the considered synchronization of 
the clocks in the rocket. This change of frequency we call acceleration red shift. 

One practical application of this result is when the rocket falls freely in a weak 
gravitational field. In this case the acceleration can be considered as constant in the 
dimensions of the rocket (rigid body motion) and equation (7.87) means that inside 
the rocket the frequency of the oscillators changes with the position. To show that 
this is the case, it is enough to send one photon from one end of the rocket towards 
the other and observe if the photon changes color or not. As we shall discuss later 
this occurs and this phenomenon is known as gravitational red-shift. It is one of 
the phenomena which indicated the Equivalence Principle of General Relativity, the 
latter being stated as follows: 


The local Physics in a gravitating field can be described equivalently with the Physics in a 
properly accelerated coordinate system in a space free of gravitational field. 
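
To get a feeling for the size of the effect, the sketch below estimates the fractional difference in clock rates between the two ends of a rigidly accelerated rocket. It is only an illustration under stated assumptions: it combines the rate ratio $\delta\tau_2/\delta\tau_1 = a_1^{+}/a_2^{+}$ of (7.76) with the Born rigid relation $a_2^{+} = a_1^{+}/(1 + a_1^{+}l/c^{2})$ obtained in (7.103), which gives a fractional difference $a_1^{+}l/c^{2}$. The function name and the numerical values (a terrestrial acceleration and a height of 22.5 m) are hypothetical choices.

```python
C = 299_792_458.0  # speed of light in m/s


def fractional_rate_difference(a_plus, separation):
    """(d tau_2 - d tau_1) / d tau_1 = a_plus * separation / c^2.

    Assumes the LRIO synchronization and Born rigid motion, so that the
    ratio of clock rates is 1 + a_plus * separation / c^2. The difference
    is computed directly to avoid round-off in (1 + tiny) - 1.
    """
    return a_plus * separation / C**2


# Hypothetical values: acceleration equal to terrestrial gravity, height 22.5 m.
shift = fractional_rate_difference(9.81, 22.5)
print(shift)
```

The result is of order $10^{-15}$, which indicates why the effect is hard to detect over laboratory distances.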


7.8 General One Dimensional Rigid Motion 


In this section we generalize¹³ the study of rigid motion by considering two ReMaPs 
A, B which are moving arbitrarily along the x-axis of a RIO Σ, their positions being 
described by the functions $x_A(t)$, $x_B(t)$ respectively. We shall say that A, B undergo 
rigid motion if in the proper frame of A the distance of B from A is constant and 
equal to $L_0$ during the whole motion. 

We note that the requirement of rigid motion does not refer to the RIO &, hence 
in & the two particles can have different velocities and accelerations. 

Let us formulate the conditions of rigid motion in terms of the kinematic 
variables. 

At the time moment $t$ in Σ let $\beta_A(t)$, $\gamma_A(t)$ be the kinematic factors of the proper 
frame $\Sigma_A(t)$ of A at that moment. Because we measure length in $\Sigma_A(t)$ the events 
must be simultaneous in that frame, hence not simultaneous in Σ. We have the 
following Table of coordinates: 

        Σ                                                $\Sigma_A(t)$ 
A:      $(ct,\ x_A(t))$                                  $(c\tau_A,\ 0)$ 
B:      $(ct_B,\ x_B(t_B))$                              $(c\tau_A,\ L_0)$ 
AB:     $(ct_B - ct,\ x_B(t_B) - x_A(t))_{\Sigma}$       $(0,\ L_0)_{\Sigma_A(t)}$ 

The boost relating Σ, $\Sigma_A(t)$ gives: 

\[ ct_B = ct + \gamma_A(t)\beta_A(t)L_0 \tag{7.88} \]
\[ x_B(t_B) = x_A(t) + \gamma_A(t)L_0. \tag{7.89} \]
Because we want tg > 0 we demand the restriction 


ct 


————_., 7.90 
~ Va Ba) ey) 


Lo 


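which sets an upper bound for the value of the distance $L_0$. Eliminating $t_B$ we find 
in Σ the following equation for the motion of B: 

\[ x_B\left(t + \frac{\gamma_A(t)\beta_A(t)L_0}{c}\right) = x_A(t) + \gamma_A(t)L_0. \tag{7.91} \]

We compute: 

\[ \frac{dx_B}{dt_B}\frac{dt_B}{dt} = \frac{dx_A(t)}{dt} + L_0\frac{d\gamma_A(t)}{dt}. \tag{7.92} \]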


¹³See D. Kim and Sang Gyo Jo, “Rigidity in Special Relativity”, J. Phys. A: Math. Gen. 37 (2004) 
4369. 


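But from (7.4) we have $\frac{d\gamma_A(t)}{dt} = \gamma_A^{3}(t)\beta_A(t)\frac{d\beta_A(t)}{dt}$. Hence the rhs gives: 

\[ \frac{dx_A(t)}{dt} + \gamma_A^{3}(t)\beta_A(t)\frac{d\beta_A(t)}{dt}L_0 = \frac{dx_A(t)}{dt}\left(1 + \gamma_A^{3}(t)\frac{d\beta_A(t)}{dt}\frac{L_0}{c}\right). \]

Using (7.88) we prove easily that $\frac{dt_B}{dt} = 1 + \gamma_A^{3}(t)\frac{d\beta_A(t)}{dt}\frac{L_0}{c}$. Hence imposing the 
restriction: 

\[ 1 + \gamma_A^{3}(t)\frac{d\beta_A(t)}{dt}\frac{L_0}{c} \neq 0 \tag{7.93} \]

(7.92) gives: 

\[ \frac{dx_B(t_B)}{dt_B} = \frac{dx_A(t)}{dt} \tag{7.94} \]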


which implies that the velocity of B in Σ is the same as the velocity of A in Σ. We 
infer that the proper frame of A is also the proper frame of B, therefore it is enough to 
consider the rigid motion in the proper frame of one of the particles, both particles 
being equivalent in that respect. 

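This equivalence can be seen formally as follows. Equation (7.94) implies 
$\beta_A(t) = \beta_B(t_B)$, $\gamma_A(t) = \gamma_B(t_B)$, hence if we replace 
$ct = ct_B - \gamma_A(t)\beta_A(t)L_0 = ct_B - \gamma_B(t_B)\beta_B(t_B)L_0$ in (7.89) we find: 

\[ x_A\left(t_B - \frac{\gamma_B(t_B)\beta_B(t_B)L_0}{c}\right) = x_B(t_B) - \gamma_B(t_B)L_0. \tag{7.95} \]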

The above results are general and hold for any motion of A, B along the x-axis 
of Σ. 


7.8.1 The Case of Hyperbolic Motion 


Let us apply the above results in the special case that A executes hyperbolic motion with 
proper acceleration $a_A^{+}$. As has been shown in (7.50), in this case: 

\[ x_A(t) - x_A(0) = \frac{c^{2}}{a_A^{+}}\left[\sqrt{1 + \left(\frac{a_A^{+}t}{c}\right)^{2}} - 1\right]. \tag{7.96} \]

For this type of motion we also have (see also (7.43) and (7.44)): 

\[ \gamma_A(t)\beta_A(t) = \sinh\frac{a_A^{+}\tau_A}{c} = \frac{a_A^{+}t}{c} \tag{7.97} \]
\[ \gamma_A(t) = \cosh\frac{a_A^{+}\tau_A}{c} = \frac{a_A^{+}}{c^{2}}\left(x_A(t) - x_A(0)\right) + 1. \tag{7.98} \]


We examine conditions (7.90) and (7.93). The first gives: 


oF 2 
a,t c 

ct+—-- Ip >0> 19 < 5. (7.99) 
Cc ah 


Concerning condition (7.93) we note that $\gamma_A^{3}(t)\frac{d\beta_A(t)}{dt} = \frac{a_A^{+}}{c}$. 
Therefore this condition reads: 

\[ 1 + \frac{a_A^{+}L_0}{c^{2}} \neq 0 \]

and is trivially satisfied due to (7.99). 
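Concerning the motion of B we have from (7.88): 

\[ t_B = t\left(1 + \frac{a_A^{+}L_0}{c^{2}}\right) \tag{7.100} \]

and from (7.89) and (7.98): 

\[ x_B(t_B) = x_A(t)\left(1 + \frac{a_A^{+}L_0}{c^{2}}\right) + L_0 - x_A(0)\frac{a_A^{+}L_0}{c^{2}}. \tag{7.101} \]

In order to compute $x_B(t_B)$ in terms of the time $t_B$ we use (7.100) to replace $t$ in 
the rhs of (7.101). Then: 

\[ x_B(t_B) = x_A\left(\frac{t_B}{1 + a_A^{+}L_0/c^{2}}\right)\left(1 + \frac{a_A^{+}L_0}{c^{2}}\right) + L_0 - x_A(0)\frac{a_A^{+}L_0}{c^{2}}. \]

From (7.96) we find: 

\[ x_A\left(\frac{t_B}{1 + a_A^{+}L_0/c^{2}}\right) = \frac{c^{2}}{a_A^{+}}\left[\sqrt{1 + \frac{(a_A^{+})^{2}t_B^{2}}{c^{2}\left(1 + a_A^{+}L_0/c^{2}\right)^{2}}} - 1\right] + x_A(0) \]

hence: 

\[
\begin{aligned}
x_B(t_B) &= \frac{c^{2}}{a_A^{+}}\left(1 + \frac{a_A^{+}L_0}{c^{2}}\right)\left[\sqrt{1 + \frac{(a_A^{+})^{2}t_B^{2}}{c^{2}\left(1 + a_A^{+}L_0/c^{2}\right)^{2}}} - 1\right] + L_0 - x_A(0)\frac{a_A^{+}L_0}{c^{2}} + x_A(0)\left(1 + \frac{a_A^{+}L_0}{c^{2}}\right) \\
&= \frac{c^{2}}{a_A^{+}}\left[\left(1 + \frac{a_A^{+}L_0}{c^{2}}\right)\sqrt{1 + \frac{(a_A^{+})^{2}t_B^{2}}{c^{2}\left(1 + a_A^{+}L_0/c^{2}\right)^{2}}} - 1\right] + x_A(0).
\end{aligned}
\]

We rewrite this expression as: 

\[ x_B(t_B) = \frac{c^{2}}{\dfrac{a_A^{+}}{1 + a_A^{+}L_0/c^{2}}}\sqrt{1 + \left(\frac{a_A^{+}}{1 + a_A^{+}L_0/c^{2}}\right)^{2}\frac{t_B^{2}}{c^{2}}} - \frac{c^{2}}{a_A^{+}} + x_A(0) \tag{7.102} \]

from which we infer that the particle B is also executing hyperbolic motion with 
proper acceleration: 

\[ a_B^{+} = \frac{a_A^{+}}{1 + a_A^{+}L_0/c^{2}}. \tag{7.103} \]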

7.9 Rotational Rigid Motion 


We generalize the previous considerations to rotational rigid motion. Consider two 
ReMaPs A, O such that O is resting at the origin of the coordinates of a RIO Σ and 
A rotates in the x − y plane of Σ in a circular orbit of radius $a$. At each moment the 
velocity of A is normal to the radius in Σ, hence there is no Lorentz contraction in 
the radial direction for the LRIO at that point. This means that our assumption that 
the radius $a$ remains constant is not incompatible with the Lorentz contraction. We 
shall refer to this type of motion as rigid rotation. 

We consider now a third ReMaP B which rotates rigidly around O in the x − y 
plane of Σ at a radius $b > a$. We want to define the rigid motion of the pair A, B. 
For this we consider a time moment $t_0$ in Σ and choose the axes x, y of Σ so that at 
$t_0$ A is on the x-axis while B makes the angle $\Phi_B(t_0)$ with the x-axis. Let $\Sigma_A(t_0)$ 
be the LRIO of A at the moment $t_0$. Then the velocity of $\Sigma_A(t_0)$ wrt Σ is 
$a\omega_A(t_0)\,\hat{\mathbf{y}}$, where $\omega_A$ is the angular velocity of A and $\hat{\mathbf{y}}$ is the unit vector 
tangent to the orbit of A. We have the following Table of coordinates between the RIO Σ 
and the LRIO $\Sigma_A(t_0)$ at the moment $t_0$ of Σ (the z component is suppressed): 

        Σ                                                        $\Sigma_A(t_0)$ 
A:      $(ct_0,\ a,\ 0)$                                         $(c\tau_A,\ 0,\ 0)$ 
B:      $(ct_B,\ b\cos\Phi_B,\ b\sin\Phi_B)$                     $(c\tau_B,\ x'_B(\tau_B),\ y'_B(\tau_B))$ 
AB:     $(ct_B - ct_0,\ b\cos\Phi_B - a,\ b\sin\Phi_B)_{\Sigma}$ $(c\tau_B - c\tau_A,\ x'_B(\tau_B),\ y'_B(\tau_B))_{\Sigma_A(t_0)}$ 

The boost along the y-axis gives: 

\[ c\tau_B - c\tau_A = \gamma_A(t_0)\left(ct_B - ct_0 - \beta_A(t_0)\,b\sin\Phi_B\right) \tag{7.104} \]
\[ x'_B(\tau_B) = b\cos\Phi_B - a \]
\[ y'_B(\tau_B) = \gamma_A(t_0)\left[b\sin\Phi_B - \beta_A(t_0)(ct_B - ct_0)\right]. \]
We define the rotational rigid motion of the pair A, B wrt Σ by the requirements: 


a. The angle $\phi_{AB}$ made by the radii OA, OB at $t_0$ will be constant during the rigid 
rotational motion of A, B. 
b. The distance of A, B in Σ and/or $\Sigma_A(t_0)$ is the same and it is a constant. 


We note that the distance of A, B in the frame Σ equals: 

\[ \sqrt{a^{2} + b^{2} - 2ab\cos\phi_{AB}} \]

and it is compatible with assumption a. 

To formulate the second requirement we recall that the measurement of the spatial 
distance of two events by a LRIO requires that the events are simultaneous in that 
LRIO. We consider the measurement of the distance first in Σ and then in $\Sigma_A(t_0)$. 


Measurement of the distance AB in Σ. In this case we have for the measurement 
of the distance of A, B in Σ: 

\[ t_B = t_0. \]

Then (7.104) implies (this is Eq. (2) of Grøn): 

\[ c\tau_B - c\tau_A = -\gamma_A(t_0)\beta_A(t_0)\,b\sin\Phi_B \tag{7.105} \]

from which we have: 

\[ \sin\Phi_B = -\frac{c(\tau_B - \tau_A)}{b\gamma_A(t_0)\beta_A(t_0)}. \tag{7.106} \]


We consider the effects of requirement b. in $\Sigma_A(t_0)$. 
The spatial coordinates of the event B in $\Sigma_A(t_0)$ are: 

\[ x'_B(\tau_B) = b\cos\Phi_B - a \tag{7.107} \]
\[ y'_B(\tau_B) = \gamma_A(t_0)\,b\sin\Phi_B. \]
The coordinate distance of A, B in X 4 (fo) is 
(a(@n))” + (5a(éR))” =a>+b* —2cos dap. 
Replacing we find: 
(bcos ®g — a)? + [bsin gl? = a* + b* — 2cos hap. 
The first term: 


(bcos Bg =a = b* cos” ®p — 2abcos Og ge 


=p }1 ( (te — ta) ) 2ab cos ®g +a? 
bya (to) Ba (to) 


2 c* (tp = fie 


=a+b 5 5 2ab cos Bg 
V4 (to) B4 (to) 
where in the second step we have used (7.105). 
The second term gives: 
2 te — tay’ c? (tg — fa)? 


[bsin®g? = 


b> yA (to)B4(to) 74 (t0) Ba (to). 
Replacing in the rigidity condition b. we find: 
207, _7,)2 27, —7,)2 
tp —t tp —t 
’ a fu S4beos by 4 <P os 
YA (to) Ba (to) YA (to) Ba (to) 


cos Bg = cos dap 


a’ +b 


=a’ +b? — 2ab cos @aB 


that is, B is on the same radius for both Σ and $\Sigma_A(t_0)$. However the time 
coordinate of B is different in Σ and $\Sigma_A(t_0)$. 

In Σ the orbit of B is a circle of radius $b$. In $\Sigma_A(t_0)$ the orbit of B is given by 
the equations 

\[ x'_B(\tau_B) = b\cos\Phi_B - a \tag{7.108} \]
\[ y'_B(\tau_B) = \gamma_A(t_0)\,b\sin\Phi_B \]

where $c\tau_B - c\tau_A = -\gamma_A(t_0)\beta_A(t_0)\,b\sin\Phi_B$. It follows 

\[ \left(x'_B(\tau_B) + a\right)^{2} + \frac{1}{\gamma_A^{2}(t_0)}\,y'_B(\tau_B)^{2} = b^{2}. \]

This is an ellipse in the (x′, y′) plane with center at the point (−a, 0) and semi-axes 
$b$ and $\gamma_A(t_0)b$ respectively. 

Measurement of the distance AB in $\Sigma_A(t_0)$. In this case we have the condition 

\[ c\tau_B - c\tau_A = 0. \]

Then (7.104) implies: 

\[ ct_B - ct_0 = \beta_A(t_0)\,b\sin\Phi_B \tag{7.109} \]

from which we have: 

\[ \sin\Phi_B = \frac{ct_B - ct_0}{b\beta_A(t_0)}. \tag{7.110} \]


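The requirement of rigidity gives: 

\[ \left(x'_B(\tau_B)\right)^{2} + \left(y'_B(\tau_B)\right)^{2} = a^{2} + b^{2} - 2ab\cos\phi_{AB}. \]

Replacing we find: 

\[ (b\cos\Phi_B - a)^{2} + \left[\gamma_A(t_0)\left(b\sin\Phi_B - \beta_A(t_0)(ct_B - ct_0)\right)\right]^{2} = a^{2} + b^{2} - 2ab\cos\phi_{AB}. \]

The first term: 

\[
\begin{aligned}
(b\cos\Phi_B - a)^{2} &= b^{2}\cos^{2}\Phi_B - 2ab\cos\Phi_B + a^{2} \\
&= b^{2}\left[1 - \left(\frac{ct_B - ct_0}{\beta_A(t_0)b}\right)^{2}\right] - 2ab\cos\Phi_B + a^{2} \\
&= a^{2} + b^{2} - \frac{c^{2}(t_B - t_0)^{2}}{\beta_A^{2}(t_0)} - 2ab\cos\Phi_B
\end{aligned}
\]

where in the second step we have used (7.110). 

The second term gives: 

\[
\begin{aligned}
\left[\gamma_A(t_0)\left(b\sin\Phi_B - \beta_A(t_0)(ct_B - ct_0)\right)\right]^{2} &= \gamma_A^{2}(t_0)\left[\frac{ct_B - ct_0}{\beta_A(t_0)} - \beta_A(t_0)(ct_B - ct_0)\right]^{2} \\
&= \gamma_A^{2}(t_0)\frac{c^{2}(t_B - t_0)^{2}}{\beta_A^{2}(t_0)}\left(1 - \beta_A^{2}(t_0)\right)^{2} \\
&= \frac{c^{2}(t_B - t_0)^{2}}{\gamma_A^{2}(t_0)\beta_A^{2}(t_0)}.
\end{aligned}
\]

Replacing in the rigidity condition we find: 

\[ a^{2} + b^{2} - \frac{c^{2}(t_B - t_0)^{2}}{\beta_A^{2}(t_0)} - 2ab\cos\Phi_B + \frac{c^{2}(t_B - t_0)^{2}}{\gamma_A^{2}(t_0)\beta_A^{2}(t_0)} = a^{2} + b^{2} - 2ab\cos\phi_{AB} \]

from which follows 

\[ c^{2}(t_B - t_0)^{2} + 2ab\cos\Phi_B = 2ab\cos\phi_{AB}(t_0). \tag{7.111} \]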


Conditions (7.109) and (7.111) quantify the rotational rigid motion of the 
pair A, B. The first equation gives the time $t_B$ in Σ of the event B such that, when the 
event A occurs at time $t_0$ on the x-axis of Σ, both events occur at the same 
moment in the LRIO $\Sigma_A(t_0)$. Given $\phi_{AB}(t_0)$, the second equation determines the 
angle $\Phi_B(t_B)$ of B so that the spatial distance of A, B will be constant for the LRIO 
$\Sigma_A(t_0)$ of A at the time moment $t_0$. Since the time moment $t_0$ is arbitrary, the above 
conclusion holds at all times in Σ. 

From these conditions we have the following constraint resulting from the 
identity $\cos^{2}\Phi_B + \sin^{2}\Phi_B = 1$: 

\[ \frac{c(t_B - t_0)}{b\beta_A(t_0)} = \sqrt{1 - \left(\cos\phi_{AB} - \frac{c^{2}(t_B - t_0)^{2}}{2ab}\right)^{2}}. \tag{7.112} \]


The instantaneous rest frame of B at the moment $t_B$ coincides with the instantaneous 
rest frame of A at the moment $t_0$, hence what we have said for the second 
automatically applies to the first. We conclude that the definition of rigidity for 
the pair A, B we gave is independent of which particle we consider as a reference 
particle. We also note that if a is a monotonically increasing function of time, 
then so is aot) 

The orbit of B in both © and Xy4(fo) is a circle however the first parameterized 


with ctg = cto + Ba(to)b sing, and the other with f,. 


7.9.1 The Transitive Property of the Rigid Rotational Motion 


We consider four ReMaPs A, B, D, O which are moving so that A, B are in rigid 
rotation about O, and B, D are also in rigid rotation about O, at radii $d > b > a$. 
We define the transitive property of the rigid rotational motion by the requirement 
that if the pairs A, B and B, D execute rigid rotation in Σ then A, D also execute 
rigid rotation in Σ. 

Let us examine if this kinematic requirement can be satisfied and under what 
conditions. 

The conditions that the particles A, B are in rigid rotation about O at the 
moment $t_0$ of Σ are: 

\[ \frac{d\theta_B(t_B)}{dt_B} = \left.\frac{d\theta_A(t)}{dt}\right|_{t_0} = \frac{\beta_A(t_0)}{a}\,c \tag{7.113} \]
\[ ct_B - ct_0 = \beta_A(t_0)\,b\sin\Phi_B(t_B). \tag{7.114} \]

Similarly, because the instantaneous rest frame of B at the moment $t_B$ coincides with 
the instantaneous rest frame of A at the moment $t_0$ of Σ, for the pair B, D we have 
the equations: 

\[ \frac{d\theta_D(\tilde{t}_D)}{d\tilde{t}_D} = \frac{d\theta_B(t_B)}{dt_B} \tag{7.115} \]
\[ c\tilde{t}_D - ct_B = \beta_B(t_B)\,d\sin\left[\Phi_D(\tilde{t}_D) - \Phi_B(t_B)\right] \tag{7.116} \]

where $\tilde{t}_D$ is the time in the rest frame of B. 
The transitive property will be satisfied if the pair A, D is rigidly rotating in Σ, 
which requires that the following two conditions be satisfied: 

\[ \frac{d\theta_D(t_D)}{dt_D} = \left.\frac{d\theta_A(t)}{dt}\right|_{t_0} \tag{7.117} \]
\[ ct_D - ct_0 = \beta_A(t_0)\,d\sin\Phi_D(t_D) \tag{7.118} \]

where $t_D$ is the time of D in Σ at the moment $t_0$ of Σ. From (7.113) and (7.115) 
follows: 

\[ \frac{d\theta_D(\tilde{t}_D)}{d\tilde{t}_D} = \left.\frac{d\theta_A(t)}{dt}\right|_{t_0}, \]

therefore the first condition (7.117) is satisfied. 
We check the second condition (7.118). 
Adding (7.116) and (7.114) we obtain: 

\[ c\tilde{t}_D - ct_0 = \beta_B(t_B)\,d\sin\left[\Phi_D(\tilde{t}_D) - \Phi_B(t_B)\right] + \beta_A(t_0)\,b\sin\Phi_B(t_B). \]

Replacing $\beta_B(t_B) = \frac{b}{c}\frac{d\theta_B(t_B)}{dt_B} = \frac{b}{c}\left.\frac{d\theta_A(t)}{dt}\right|_{t_0} = \frac{b}{a}\beta_A(t_0)$ we find 

\[ c\tilde{t}_D - ct_0 = \frac{b}{a}\,\beta_A(t_0)\left[d\sin\left[\Phi_D(\tilde{t}_D) - \Phi_B(t_B)\right] + a\sin\Phi_B(t_B)\right]. \tag{7.119} \]

Obviously relation (7.119) coincides with (7.118) only for special cases. 

One important special case is when the three points A, B, D are along the same 
radius on the plane of rotation, i.e. when $\Phi_D(\tilde{t}_D) = \Phi_B(t_B) = \Phi_A(t_0)$. Then (7.119) 
becomes 

\[ c\tilde{t}_D - ct_0 = b\beta_A(t_0)\sin\Phi_D(\tilde{t}_D) \]

and coincides with (7.118) provided that 

\[ \tilde{t}_D = t_0 + \frac{b}{d}(t_D - t_0). \tag{7.120} \]


This relation determines the time of every event D in the instantaneous rest frame 
of any other event B in terms of the time of the event in the frame Σ. 

Obviously this condition is satisfied trivially in Newtonian Physics where time is 
absolute but in general not in Special Relativity. 

We conclude that in Special Relativity the rotation of a rigid rod about one of its 
points is a rigid rotation. Another case of rigid rotation in Special Relativity is the 
rotation of a rigid disc about its center. In the latter case, during the 
rigid rotation the points on each radius stay at all times on a radius, i.e. they have the 
same angular velocity. 


7.10 The Rotating Disc 


We come to the final case of our considerations, that is, the rigid motion of a rotating 
disc. This is a complex problem which appears to be still unsolved in the current 
literature, something to be expected because it requires so many conventions that the 
choice of a unique approach is not feasible. In any case it is an interesting problem 
and in this section we shall follow the rather standard and widely accepted (but not 
unique!) approach of Grøn. 


7.10.1 The Kinematics of Relativistic Observers 


Before we enter into the discussion of the rotating disc problem we have to advance 
a little our knowledge concerning the relativistic kinematics. Let us see why. The 
kinematics of Special Relativity developed in the previous sections was concerned 


mainly with the relativistic mass point (ReMaP). This kinematics has been extended 
with Born's rigid motion to cover the rigid body of Newtonian Physics. However 
the problem in Special (and General) Relativity is that one does not have solids, 
therefore one has to define the relativistic body, and then develop a kinematics 
appropriate for this type of body. This will become apparent from the discussion 
of the rotating disc. 

Kinematics is a comparative study which requires two observers. In Special 
Relativity there are two different cases to consider: 


a. Kinematics between two relativistic inertial observers (RIO) 
b. Kinematics between a RIO and an accelerating observer. 


In the first case the kinematics is determined by the Lorentz transformation 
relating the observers. That is, the choice of the second observer is equivalent to 
the choice of the appropriate Lorentz transformation (via the β-factor). In the 
second case the choice of the accelerating observer is at will, because the Lorentz 
transformation does not relate a RIO with an accelerating observer; therefore in 
this case the kinematics has to be defined. This is done by the definition of a metric (of 
Lorentzian character) for the accelerated observer by means of two actions: 


a. The definition of a coordinate transformation which relates the coordinates of the 
RIO with those of the accelerated observer. 

b. By assuming that this transformation leaves the form of the metric the same 
(ds* = d5*), that is, the new metric is found from the Minkowski metric (of 
the RIO) if one replaces the RIO coordinates in terms of the coordinates of the 
accelerated observer. Using this new metric one determines the relation of the 
time and the spatial intervals between the two observers. 


One may consider the proposed coordinate transformation as “a generalized 
Lorentz transformation”. However this transformation is very different from the 
standard Lorentz transformation. Indeed the proposed transformation: 


a. Is specific to the two observers involved. 

b. Is not an isometry of Minkowski space, that is, it is not represented by a 
Lorentz matrix.¹⁴ 

c. Does not necessarily generate a (Lie) group, hence it is possible that one cannot 
define a covariance property.¹⁵ 


We note that the new metric is still flat, because with a coordinate transformation 
it is not possible to create new tensors.¹⁶ 


¹⁴For those who have a knowledge of geometry, it is not generated by a Killing vector of the 
Minkowski metric. 

¹⁵For example the standard Lorentz transformations do form a group and the tensors which 
transform covariantly under this group are the Lorentz tensors. These tensors describe covariantly 
(within Special Relativity) the physical quantities of the theory. 

¹⁶Flat means the curvature tensor vanishes. This means that the coordinate functions are defined 
everywhere in the space. Under a coordinate transformation it is possible to have new quantities, 


Having a metric for the relativistic observer it is possible to define coordinate 
time and the space intervals. This is important and it is done as explained below.¹⁷ 


7.10.2. Chronometry and the Spatial Line Element 


The coordination in spacetime by a relativistic observer (inertial or not) is done by
means of chronometry. Let us recall briefly this procedure. The relativistic observer
is equipped with a photon-gun (a torch) and a clock. In order to determine the
coordinates of an event the observer emits a light beam towards the point in space
where the event takes place, in the direction $\mathbf{e}$, such that the beam is reflected at
a ‘mirror’ placed at that point of space and returns along the same direction back to
the observer. The observer notes the reading of his clock for the event of emission,
$t_e$, and the event of reception, $t_r$. He also reads from the scale of the photon-gun
the direction (angles) of the beam in his coordinate frame, $\Sigma$ say. Subsequently the
observer defines the following coordinates for the event in $\Sigma$:


$$\left(\frac{t_e + t_r}{2},\ \frac{t_r - t_e}{2}\,c\,\mathbf{e}\right)_{\Sigma}. \tag{7.121}$$


Using the chronometric coordination one can determine geometrically if a given
relativistic observer $\Sigma$ is a RIO or an accelerating observer. This is done as follows
(see Fig. 7.14). Let P be an arbitrary spacetime event whose coordinates have been
determined by $\Sigma$. Draw from P the plane normal to the world line of the observer.
Let $t_P$ be the intersection of that plane with the world line of the observer. If
$t_P - t_e = t_r - t_P$ for all spacetime events P then the observer $\Sigma$ is a RIO, otherwise it is an
accelerating observer.
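The chronometric rule can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: units with c = 1, an inertial observer at rest at the origin, and helper names (`radar_coordinates`, `emission_reception`) that are hypothetical and not taken from the text.

```python
C = 1.0  # speed of light; working in units with c = 1 is an assumption of this sketch

def radar_coordinates(t_e, t_r, c=C):
    """Time and distance the observer assigns to the reflection event.

    t_e: clock reading at emission, t_r: clock reading at reception.
    """
    t = 0.5 * (t_e + t_r)
    r = 0.5 * c * (t_r - t_e)
    return t, r

def emission_reception(t_event, r_event, c=C):
    """Clock readings of an inertial observer at rest at the origin for a
    light signal reflected at the event (t_event, r_event)."""
    return t_event - r_event / c, t_event + r_event / c

t_e, t_r = emission_reception(5.0, 2.0)
print(radar_coordinates(t_e, t_r))
```

For the inertial observer the midpoint of emission and reception coincides with the time of the event, which is the RIO criterion stated above.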

Let us assume now that the metric of the relativistic observer, defined in the
way explained above, in his coordinate frame, $\Sigma$ say, has the form:

$$ds^2 = g_{\mu\nu}dx^\mu dx^\nu + 2g_{0\mu}dx^0dx^\mu + g_{00}(dx^0)^2. \tag{7.122}$$

The light rays emitted and received by the observer performing the chronometric
measurement of the coordinates of an event P belong to the light cone of
the point P. The equation of this cone is:

$$g_{\mu\nu}dx^\mu dx^\nu + 2g_{0\mu}dx^0dx^\mu + g_{00}(dx^0)^2 = 0 \tag{7.123}$$


which are not tensors (e.g. the connection coefficients $\Gamma^i_{jk}$), but not new tensors such as the curvature tensor.
The reason for this is that the curvature tensor is covariant wrt the group of general
coordinate transformations (manifold mapping group), which includes whatever coordinate
transformations we introduce between the coordinates of the RIO and the coordinates of the
accelerated observer.


17 See also Ref: Landau & Lifshitz, The Classical Theory of Fields, 5th Edition, p. 234.


7.10 The Rotating Disc 239 


Fig. 7.14 The coordination of an accelerating and an inertial observer (panels: Accelerated Observer, Inertial Observer)

where $dx^0$ is to be taken as the travel time of the light beam to and from P
and $dx^\mu$ are the coordinates defining the space axes of the coordinate system of the
observer. Considering (7.123) as a quadratic equation in $dx^0$ and solving for
$dx^0$ we find:

$$dx^0_{\pm} = \frac{1}{g_{00}}\left[-g_{0\mu}dx^\mu \pm \sqrt{(g_{0\mu}g_{0\nu} - g_{\mu\nu}g_{00})\,dx^\mu dx^\nu}\right]. \tag{7.124}$$

These values of $dx^0$ concern the bidirectional propagation of light, one value for
the “go” and the other for the “return” trip of the light ray. The difference of the two
roots gives the total travel (coordinate) time of the beam. We find:

$$dx^0_{CD} = dx^0_{+} - dx^0_{-} = \frac{2}{g_{00}}\sqrt{\left|(g_{0\mu}g_{0\nu} - g_{\mu\nu}g_{00})\right| dx^\mu dx^\nu} \tag{7.125}$$

where (C) is the event of emission and (D) the event of reception.

The proper time between the events C, D is obtained if we set $dx^\mu = 0$ in
(7.122):

$$ds^2 = -c^2d\tau^2 = g_{00}(dx^0_{CD})^2 \tag{7.126}$$

from which follows:

$$d\tau = \frac{1}{c}\sqrt{|g_{00}|}\,dx^0. \tag{7.127}$$


This equation relates the proper time interval of the relativistic observer (that is, 
the time period measured by its proper clock) with the coordinate time determined 
by the chronometric procedure. 

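Concerning the spatial distance of the event P, half of the proper travel time of the beam gives (c = 1):

$$dl_P = \frac{1}{2}\sqrt{|g_{00}|}\,dx^0_{CD} = \sqrt{\left|\frac{g_{0\mu}g_{0\nu}}{g_{00}} - g_{\mu\nu}\right| dx^\mu dx^\nu}. \tag{7.128}$$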


240 7 Four-Acceleration 


This leads us to consider the 3-dimensional line element:

$$dl^2 = \left|\frac{g_{0\mu}g_{0\nu}}{g_{00}} - g_{\mu\nu}\right| dx^\mu dx^\nu \tag{7.129}$$

as a positive definite line element, which defines the spatial metric of the relativistic
observer^{18} $\Sigma$.

Relations (7.127) and (7.129) describe the kinematics of a relativistic (inertial
or accelerating) observer. The quantities $d\tau$, $dl_\mu$ are the units of time and spatial
distance determined by the accelerating observer and $dx^0$, $dx^\mu$ the coordinate
values of these quantities determined by the chronometric coordination. These
relations provide us with all the necessary equipment to discuss the rigid motion of the
rotating disc.
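As a hedged illustration of relations (7.127) and (7.129), the following Python sketch computes the clock factor $\sqrt{|g_{00}|}$ and the spatial metric $g_{\mu\nu} - g_{0\mu}g_{0\nu}/g_{00}$ from a given 4×4 metric. The function names and the nested-list representation are assumptions of this sketch, with index 0 for time, signature (−,+,+,+) and c = 1.

```python
def spatial_metric(g):
    """Spatial metric h[m][n] = g[m][n] - g[0][m]*g[0][n]/g[0][0], m, n = 1..3.

    g is a 4x4 nested list with index 0 for time and signature (-,+,+,+).
    """
    return [[g[m][n] - g[0][m] * g[0][n] / g[0][0] for n in range(1, 4)]
            for m in range(1, 4)]

def proper_time_factor(g):
    """d(tau)/d(x0) = sqrt(|g00|), in units with c = 1."""
    return abs(g[0][0]) ** 0.5

minkowski = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
print(proper_time_factor(minkowski), spatial_metric(minkowski))
```

For the canonical metric diag(−1, 1, 1, 1) the sketch returns a unit clock factor and the Euclidean spatial metric, in agreement with Remark 3 below.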


Remarks 


1. In the derivation we have considered $dx^0_{+} > 0$, $dx^0_{-} < 0$ but this is not necessary:
   $dx^0_{+}$, $dx^0_{-}$ can have any sign. The fact that in this case the value of $x^0(D)$ at
   the moment of arrival of the light signal in D might be less than the value $x^0(C)$
   at the moment of departure from C contains no contradiction, since the
   clocks at different points in space are not assumed to be synchronized in any way.

2. If we wish to find the spatial length $dl_\mu$ along the coordinate $x^\mu$ say, we simply
   write (no summation over $\mu$):

$$dl_\mu = \sqrt{\left|\frac{g_{0\mu}g_{0\mu}}{g_{00}} - g_{\mu\mu}\right|}\;dx^\mu. \tag{7.130}$$

   In this equation $dx^\mu$ is the coordinate spatial length, that is the spatial length
   determined by the chronometric coordination. The above equation means that
   the spatial distance of the point P along the coordinate line $x^\mu$ attributed by the
   observer is $\sqrt{\left|\frac{g_{0\mu}g_{0\mu}}{g_{00}} - g_{\mu\mu}\right|}$ times the coordinate length (i.e. coordinate unit)
   along that coordinate line.

3. If the observer is inertial then the metric has components diag(−1, 1, 1, 1) and
   we find $d\tau = dt$, $dl^2 = \delta_{\mu\nu}dx^\mu dx^\nu$, which are the expected answers.


7.10.3 The Rotating Disk 


A circular disk of radius R lying on the plane x, y of a RIO $\Sigma$ rotates with constant
angular velocity $\omega$ about the z-axis. The center of the disk is assumed to be at the
point (−R, 0, 0). This description, although perfectly alright in Newtonian Physics,


18 Not of spacetime! Each relativistic observer has its own spatial geometry!




is meaningless in Special Relativity if we do not define(!) precisely for $\Sigma$ what we
mean by a rotating disk. Indeed in Newtonian Physics the disk is a solid, therefore
we do not expect any change in the disk when it is resting and when it is set in
rotation. However in Special (and General) Relativity there do not exist solid bodies
and one has to define instead “solid motion”.

There is not a unique or “the correct” definition of the rotating disk and one can 
define it as one wishes. Then the theory produces the consequences of the specific 
definition and the “experiment” tests the validity or not of the results. This means 
that all approaches/assumptions concerning the rotating disk are acceptable until 
experiment will select “the one”, if it exists. It is to be noted that the outcome does 
not prove or disprove Special Relativity, but concerns only the specific assumptions, 
which have been made in the definition of the moving body. If the experiment 
verifies these assumptions then Special Relativity is extended to apply to this type 
of moving body and all is well done. If not, then the assumptions are abandoned but 
Special Relativity stays! 

In the following we will consider a “standard” approach using the material we
developed in Sect. 7.9, following to a large extent the approach due to Grøn.^{19} For
further information on the problem of the rotating disc and other approaches the
reader is referred to the work of R. Klauber.^{20}

In the case of the rotating disk there are three types of observers which we have
to consider: the RIO $\Sigma$ for whom the rotating disk is defined (this is the reference
RIO), a Locally Relativistic Inertial Observer (LRIO) defined by an accelerating
point at the rim of the rotating disk, and the accelerated observer rotating together
with the rotating disk (considered to be the “proper” observer of the disc).

The kinematics between the two RIOs is determined by the Lorentz transformation
corresponding to the velocity of the second LRIO. The kinematics between the
standard RIO $\Sigma$ and the accelerating observer requires:


(a) The definition of the rotating disk for the accelerated (i.e. rotating) observer and
    through this,

(b) The definition of the coordinate transformation, relating the coordinate systems 
of the two observers. This transformation determines a metric for the rotating 
observer via relations (7.126) and (7.128) and determines the kinematics of the 
accelerated observer in terms of the kinematics of the RIO. 


7.10.4 Definition of the Rotating Disk for a RIO 


When the disk rotates: 


a. It preserves its shape (that is it remains a plane circular disk as it rotates). 


19 Grøn (1975) ‘Relativistic description of a rotating disk’ Am. J. Physics 43 pp. 869-876.
20 R. Klauber ‘Relativistic rotation: A comparison of theories’ gr-qc/0604118, 16 December 2006.




b. The disk rotates with constant angular velocity about the fixed direction of the
   z-axis at its center.

c. There is no relative rotation of the parts of the disk. Kinematically this means 
that the points on a radius remain on the radius during the rotation of the disk. 
As we have seen in Sect. 7.9.1 this is necessary if the transitivity property of the 
rigid rotational motion is to be satisfied and the points along a radius remain on 
a radius during rotation. 

d. The radius R of the disk when at rest and after it is made to rotate does not
   change.^{21}


It is obvious that under the above assumptions for $\Sigma$, the disc is assumed to rotate
as a relativistic “solid” (Born conditions are satisfied). Due to the above assumptions
the angle $\alpha$ between two radii of the disk before and during the motion remains
constant. This implies for the arc length across the rotating disk:

$$\frac{ds}{r} = d\theta, \qquad 0 < r < R. \tag{7.131}$$


The angular velocity of the disk for the observer $\Sigma$ is defined as follows:

$$\omega = \frac{d\theta}{dt}$$

where $t$ is time in $\Sigma$. Replacing in (7.131) we find:

$$\frac{ds/dt}{r} = \omega. \tag{7.132}$$


But $ds/dt = v$ is the speed of rotation (at distance r from the axis of rotation),
therefore:

$$v = \omega r. \tag{7.133}$$


The condition:

$$v < c \;\Rightarrow\; \omega < \frac{c}{R}$$

restricts the values of the angular velocity $\omega$.
The above exhaust the assumptions defining the rotating disk for the RIO $\Sigma$.


21 These assumptions can be stated in a more advanced language. Here we state without further
comment the geometric significance of each of the four assumptions in terms of the fluid of
observers defined by the disc.

a. The fluid defined by the disk has zero shear, $\sigma_{ab} = 0$

b. $\omega_{ab;c} = 0$

c. $\dot{u}^a \neq 0$

d. $\theta = 0$. The Born rigidity conditions $\sigma_{ab} = 0$, $\theta = 0$ are satisfied.


7.10.5 The Locally Relativistic Inertial Observer (LRIO) 


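We consider a point A along the rim of the disk and the comoving
LRIO $\Sigma_A$ whose velocity (in $\Sigma$!) is $\mathbf{v} = \omega R\,\mathbf{j}$, where $\mathbf{j}$ is the unit vector tangent to
the rim of the disc. As we did in Sect. 7.9 we choose the coordinates of $\Sigma$, $\Sigma_A$ so
that $\mathbf{j}$ is also the unit vector along the y-axis (see Fig. 7.15). We consider a point
B at the rim of the disk and write the components of the position four-vector, the
four-velocity and the four-acceleration of B in $\Sigma$ (a dot denotes differentiation with respect to $t$):

$$x^i_B = \begin{pmatrix} ct_B \\ -R(1-\cos\theta_B) \\ R\sin\theta_B \\ 0 \end{pmatrix}_{\Sigma},\quad
u^i_B = \frac{dx^i_B}{d\tau} = \begin{pmatrix} \gamma c \\ -\beta c\gamma\sin\theta_B \\ \beta c\gamma\cos\theta_B \\ 0 \end{pmatrix}_{\Sigma},\quad
a^i_B = \frac{du^i_B}{d\tau} = \begin{pmatrix} c\gamma\dot{\gamma} \\ -\beta c\gamma(\dot{\gamma}\sin\theta_B + \omega\gamma\cos\theta_B) \\ \beta c\gamma(\dot{\gamma}\cos\theta_B - \omega\gamma\sin\theta_B) \\ 0 \end{pmatrix}_{\Sigma}. \tag{7.134}$$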


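These components are found by the observation of the disk in $\Sigma$.
In order to determine the observation of the disk in $\Sigma_A$ we assume that in $\Sigma_A$:

$$x^i_{B,A} = \begin{pmatrix} ct_{B,A} \\ x_{B,A} \\ y_{B,A} \\ 0 \end{pmatrix}_{\Sigma_A},\quad
u^i_{B,A} = \begin{pmatrix} c\gamma_{B,A} \\ \gamma_{B,A}v^x_{B,A} \\ \gamma_{B,A}v^y_{B,A} \\ 0 \end{pmatrix}_{\Sigma_A} \tag{7.135}$$

where $\gamma_{B,A}$ is the $\gamma$-factor of the point B in $\Sigma_A$. $\Sigma_A$ is related to $\Sigma$ with a boost
along the y-axis with $\beta = \omega R/c$. For the position four-vector the boost gives: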


Fig. 7.15 Observation of the disk by the LRIO in $\Sigma_A$ at the moment $t_{B,A} = 0$ (panel label: Inertial Observer)


22 $t$ is time in $\Sigma$.




$$ct_B = \gamma(ct_{B,A} + \beta y_{B,A}) \tag{7.136}$$
$$x_{B,A} = -R(1-\cos\theta_B) \tag{7.137}$$
$$y_{B,A} = \gamma(R\sin\theta_B - \beta ct_B). \tag{7.138}$$

Replacing $t_B$ from (7.136) in (7.138) we get:

$$y_{B,A} = R\left(\frac{1}{\gamma}\sin\theta_B - \omega t_{B,A}\right). \tag{7.139}$$

Replacing $y_{B,A}$ from (7.139) in (7.136) it follows that:

$$ct_B = \frac{1}{\gamma}ct_{B,A} + \beta R\sin\theta_B \tag{7.140}$$

which relates the time coordinate of the point B in $\Sigma$ and $\Sigma_A$, respectively.
(Obviously all the above go through if we assume instead of a point B on the rim
any other point at a distance r < R from the center of the rotating disk.)

From the above relations we compute:

$$\left(1 + \frac{x_{B,A}}{R}\right)^2 + \left(\frac{\gamma\,(y_{B,A} + \omega R\,t_{B,A})}{R}\right)^2 = 1. \tag{7.141}$$


7.10.5.1 Observation of the Disk in $\Sigma_A$


The observation of the disk in $\Sigma_A$ is by definition the measurement of the position
of the points of the disk, measured simultaneously in $\Sigma_A$. Thus we are interested
in the position of the points B of the disk at one moment of $\Sigma_A$. Let us choose the
moment $t_{B,A} = 0$. Then the observation of the disk in the LRIO $\Sigma_A$ is the locus
of the points B of the rim of the disk for which $t_{B,A} = 0$. The spatial coordinates
of these points are found from (7.137) and (7.139) or better from (7.141) if we set
$t_{B,A} = 0$. We find:

$$\left(1 + \frac{x_{B,A}}{R}\right)^2 + \left(\frac{\gamma\,y_{B,A}}{R}\right)^2 = 1. \tag{7.142}$$


This equation describes an ellipse centered at the point (−R, 0) whose minor
semi-axis is $R/\gamma$ (along the y-axis) and whose major semi-axis is R (along the x-axis)
(see Fig. 7.15).
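The boost relations (7.136) to (7.141) can be checked numerically. The sketch below is only illustrative: the function name `lrio_coordinates` and the default values R = 1, β = 0.6, c = 1 are assumptions chosen for the example.

```python
import math

def lrio_coordinates(theta_B, R=1.0, beta=0.6, c=1.0):
    """Boost the event of the rim point B from Sigma to Sigma_A (boost along y).

    theta_B = omega * t_B is the angle of B in Sigma.
    Returns (t_BA, x_BA, y_BA).
    """
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    omega = beta * c / R                 # beta = omega R / c
    t_B = theta_B / omega
    x_B = -R * (1.0 - math.cos(theta_B))
    y_B = R * math.sin(theta_B)
    t_BA = gamma * (t_B - beta * y_B / c)    # inverse of (7.136)
    y_BA = gamma * (y_B - beta * c * t_B)    # (7.138)
    return t_BA, x_B, y_BA                   # x is unchanged by the boost, (7.137)

t_BA, x_BA, y_BA = lrio_coordinates(1.0)
gamma, omega, R = 1.25, 0.6, 1.0
# Left-hand side of (7.141); it should be equal to 1 up to rounding.
print((1 + x_BA / R)**2 + (gamma * (y_BA + omega * R * t_BA) / R)**2)
```

Setting `t_BA = 0` in the printed combination reproduces the ellipse (7.142).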

It is possible to relate the observation of the disk in $\Sigma_A$ at the moment $t_{B,A} = 0$
with the observation of the disk in $\Sigma$. This is done as follows. From (7.140) we have
that the moment $t_{B,A} = 0$ corresponds to the moment

$$ct_B = \beta R\sin\theta_B \tag{7.143}$$




where $\theta_B = \omega t_B$ due to the rotation of the disc with constant angular velocity $\omega$ in
$\Sigma$. Therefore the corresponding observation of the disk in $\Sigma$ at the moment $t_{B,A} = 0$
of $\Sigma_A$ is given by the position four-vector

$$x^i_B(t_{B,A} = 0) = \begin{pmatrix} ct_B \\ -R(1-\cos\omega t_B) \\ R\sin\omega t_B \\ 0 \end{pmatrix}_{\Sigma} \tag{7.144}$$

where $t_B$ is given by (7.143). Obviously the observed disk in $\Sigma$ is still a rotating
disk of radius R.


7.10.5.2 Calculation of the Path of B in $\Sigma_A$

Consider in $\Sigma$ a point B on the rim of the disk at the moment $t_B$ in $\Sigma$. Then the
coordinates of B in $\Sigma_A$ are given by (7.136), (7.137), and (7.138) where $\theta_B = \omega t_B$.
In order to find the path of B in $\Sigma_A$ we replace $t_B$ in terms of $t_{B,A}$ and find

$$x_{B,A} = -R\left(1 - \cos\{\omega\gamma(t_{B,A} + (\beta/c)y_{B,A})\}\right) \tag{7.145}$$
$$y_{B,A} = R\left(\frac{1}{\gamma}\sin\{\omega\gamma(t_{B,A} + (\beta/c)y_{B,A})\} - \omega t_{B,A}\right).$$

These equations define in the plane of the disk a cycloid as shown in Fig. 7.16.
In order to compare this result with the corresponding Newtonian result we set
$\gamma = 1$, $\beta = 0$, $t_{B,A} = t_B$ (Newtonian absolute time) and find:

$$x_{B,A\,\mathrm{Newt.}} = -R(1 - \cos\omega t_B) \tag{7.146}$$
$$y_{B,A\,\mathrm{Newt.}} = R(\sin\omega t_B - \omega t_B).$$

These equations describe again in the plane of the disk a cycloid as shown in
Fig. 7.16.
To find the lowest point $y_{B,A,\mathrm{low}}$ of the cycloid along the y axis we set $x_{B,A} = 0$
in the first of (7.145) and find $\omega\gamma(t_{B,A} + (\beta/c)y_{B,A}) = 2\pi$, from which follows
$t_{B,A} = \frac{2\pi}{\omega\gamma} - (\beta/c)y_{B,A}$. Replacing this $t_{B,A}$ in the second of (7.145) we find

$$y_{B,A,\mathrm{low}} = -\gamma\,2\pi R \tag{7.147}$$

which is larger in magnitude than the corresponding Newtonian result $2\pi R$ by the factor $\gamma$.
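The implicit equations (7.145) are easiest to explore by using the angle $\theta_B = \omega t_B$ in $\Sigma$ as the curve parameter and boosting each event to $\Sigma_A$. The following Python sketch does this; the name `path_point` and the default values R = 1, β = 0.6, c = 1 are assumptions made for illustration only.

```python
import math

def path_point(theta_B, R=1.0, beta=0.6, c=1.0):
    """Point (x_BA, y_BA) of the path of B in Sigma_A, parameterized by theta_B."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    omega = beta * c / R
    t_B = theta_B / omega
    x_BA = -R * (1.0 - math.cos(theta_B))                   # (7.137)
    y_BA = gamma * (R * math.sin(theta_B) - beta * c * t_B)  # (7.138)
    return x_BA, y_BA

gamma = 1.0 / math.sqrt(1.0 - 0.6**2)
x_low, y_low = path_point(2.0 * math.pi)
# After one full turn x returns to 0 and |y| equals gamma * 2 * pi * R.
print(x_low, y_low, -gamma * 2.0 * math.pi)
```

With β = 0 (so γ = 1) the same parameterization would require a separate Newtonian treatment, because the sketch obtains ω from β; it is meant only for 0 < β < 1.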




Fig. 7.16 The cycloid of the Newtonian observer and the LRIO $\Sigma_A$ (panels: Inertial Observer $\Sigma$, Inertial Observer $\Sigma_A$)


7.10.5.3 Calculation of the Velocity of B in $\Sigma_A$


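In order to calculate the velocity of B in $\Sigma_A$ we use the Lorentz transformation of
the four-velocity components given in (7.134). The four-velocity of B in $\Sigma_A$ equals
its relative velocity to the point A along the x-axis. In $\Sigma$ (with c = 1) we have that the
four-velocity of A is $u^i_A = (\gamma, 0, \beta\gamma, 0)_{\Sigma}$ and that of B is
$u^i_B = (\gamma, -\beta\gamma\sin\theta_B, \beta\gamma\cos\theta_B, 0)_{\Sigma}$. Therefore the
relative 4-velocity of the point B in $\Sigma$ is:

$$u^i_B - u^i_A = \begin{pmatrix} \gamma \\ -\beta\gamma\sin\theta_B \\ \beta\gamma\cos\theta_B \\ 0 \end{pmatrix}_{\Sigma} - \begin{pmatrix} \gamma \\ 0 \\ \beta\gamma \\ 0 \end{pmatrix}_{\Sigma} = \begin{pmatrix} 0 \\ -\beta\gamma\sin\theta_B \\ -\beta\gamma(1-\cos\theta_B) \\ 0 \end{pmatrix}_{\Sigma}.$$

The invariant $u^i_A u_{B\,i} = -\gamma_{B,A}$, where $\gamma_{B,A}$ is the $\gamma$-factor of B in $\Sigma_A$.
Replacing we find:

$$\gamma_{B,A} = \gamma^2(1 - \beta^2\cos\theta_B). \tag{7.148}$$

This is an invariant, therefore its value is the same for all RIO, including $\Sigma$!
We note that the point B′ of the disk opposite to the point A (i.e. $\theta_B = \pi$) has
$\gamma$-factor:

$$\gamma_{B',A} = \gamma^2(1 + \beta^2) = 2\gamma^2 - 1.$$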




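We consider now the boost relating $\Sigma$, $\Sigma_A$ applied to the four-velocity vector of
the point B. We have the components $(\gamma_{B,A}, \gamma_{B,A}v^x_{B,A}, \gamma_{B,A}v^y_{B,A}, 0)_{\Sigma_A}$
and $(\gamma, -\beta\gamma\sin\theta_B, \beta\gamma\cos\theta_B, 0)_{\Sigma}$. For
the zeroth component we find:

$$\gamma_{B,A} = \gamma(\gamma - \beta\,\beta\gamma\cos\theta_B) = \gamma^2(1 - \beta^2\cos\theta_B)$$

which coincides (as expected) with the previous result (7.148). For the other two
components we get:

$$\gamma_{B,A}v^x_{B,A} = -\beta\gamma\sin\theta_B$$
$$\gamma_{B,A}v^y_{B,A} = \gamma(\beta\gamma\cos\theta_B - \beta\gamma) = \beta\gamma^2(\cos\theta_B - 1)$$

or, replacing $\gamma_{B,A}$:

$$v^x_{B,A} = -\frac{\beta\gamma\sin\theta_B}{\gamma^2(1 - \beta^2\cos\theta_B)} = -\frac{\beta\sin\theta_B}{\gamma(1 - \beta^2\cos\theta_B)}$$
$$v^y_{B,A} = \frac{\beta\gamma^2(\cos\theta_B - 1)}{\gamma^2(1 - \beta^2\cos\theta_B)} = \frac{\beta(\cos\theta_B - 1)}{1 - \beta^2\cos\theta_B}.$$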
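As a plausibility check of (7.148) and of the velocity components, the boost of the four-velocity can be carried out numerically. This is a sketch in units with c = 1; the function name `boosted_velocity` and the sample values are assumptions of the example.

```python
import math

def boosted_velocity(theta_B, beta):
    """Boost the four-velocity of B from Sigma to Sigma_A (boost along y, c = 1).

    Returns (gamma_BA, vx_BA, vy_BA).
    """
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    u0 = gamma
    ux = -beta * gamma * math.sin(theta_B)
    uy = beta * gamma * math.cos(theta_B)
    u0_A = gamma * (u0 - beta * uy)   # zeroth component, equals gamma_BA
    uy_A = gamma * (uy - beta * u0)
    ux_A = ux                         # transverse component is unchanged
    return u0_A, ux_A / u0_A, uy_A / u0_A

beta, theta = 0.6, 1.0
g_BA, vx, vy = boosted_velocity(theta, beta)
gamma = 1.0 / math.sqrt(1.0 - beta**2)
print(g_BA, gamma**2 * (1.0 - beta**2 * math.cos(theta)))
```

The returned three-velocity also satisfies $\gamma_{B,A} = 1/\sqrt{1 - v^2}$, which is a useful consistency test of the component formulas.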


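It is possible to compute the components of the velocity of B in $\Sigma_A$ by direct
differentiation because we know the components of the position four-vector of B
in $\Sigma_A$. For example for the y-component we have (in units of c)

$$v^y_{B,A} = \frac{dy_{B,A}}{dt_{B,A}} = \frac{d}{dt_{B,A}}\left[R\left(\frac{1}{\gamma}\sin\{\omega\gamma(t_{B,A} + (\beta/c)y_{B,A})\} - \omega t_{B,A}\right)\right] = -\frac{\beta(1 - \cos\theta_B)}{1 - \beta^2\cos\theta_B} \tag{7.149}$$

where

$$\theta_B = \omega\gamma(t_{B,A} + (\beta/c)y_{B,A}). \tag{7.150}$$

Similarly we compute the component $v^x_{B,A}$.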




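7.10.5.4 Calculation of the Angle $\theta_{B,A}$ of the Radius OB in $\Sigma_A$

We have^{23}:

$$\tan\theta_{B,A} = \frac{y_{B,A} - y_{O,A}}{x_{B,A} - x_{O,A}} = \frac{R\left(\frac{1}{\gamma}\sin\theta_B - \omega t_{B,A}\right) + R\omega t_{B,A}}{-R(1 - \cos\theta_B) + R} = \frac{1}{\gamma}\tan\theta_B. \tag{7.152}$$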


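From this we calculate the angular velocity of the radius OB in $\Sigma_A$ (the angular
velocity $\omega$ is in $\Sigma$!). We have

$$\omega_{B,A} = \frac{d\theta_{B,A}}{dt_{B,A}} = \cos^2\theta_{B,A}\,\frac{d\tan\theta_{B,A}}{dt_{B,A}} = \frac{\cos^2\theta_{B,A}}{\cos^2\theta_B}\left(1 + \frac{\beta}{c}v^y_{B,A}\right)\omega. \tag{7.153}$$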


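Replacing $v^y_{B,A}$ we find after standard calculations

$$\omega_{B,A} = \frac{(1 + \gamma^2\tan^2\theta_{B,A})^{3/2}\cos^2\theta_{B,A}}{\gamma^2\left[\sqrt{1 + \gamma^2\tan^2\theta_{B,A}} \mp \beta^2\right]}\,\omega \tag{7.154}$$

where the minus sign is used for $0 < \theta_{B,A} < \pi/2$, $3\pi/2 < \theta_{B,A} < 2\pi$ and
the plus sign for $\pi/2 < \theta_{B,A} < 3\pi/2$. The angular velocity $\omega_{B,A}$ has maximal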


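value $\omega_{B,A\,\max} = \frac{\omega}{\gamma^2(1 - \beta^2)}$ for $\theta_{B,A} = 0$ and minimal value $\omega_{B,A\,\min} = \frac{\omega}{\gamma^2(1 + \beta^2)}$ for
$\theta_{B,A} = \pi$. Generally the angular velocity $\omega_{B,A}$ is greater when $x_{B,A} > -R$ than
when $x_{B,A} < -R$.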
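The passage from the chain-rule expression for $d\theta_{B,A}/dt_{B,A}$ to a closed form in $\theta_{B,A}$ can be cross-checked numerically. The sketch below assumes c = 1, uses $\tan\theta_{B,A} = \tan\theta_B/\gamma$ together with $v^y_{B,A} = \beta(\cos\theta_B - 1)/(1 - \beta^2\cos\theta_B)$, and the function names are hypothetical; the closed form is the one obtained by eliminating $\theta_B$ under these assumptions.

```python
import math

def theta_BA_from(theta_B, beta):
    """Angle of the radius OB in Sigma_A, from tan(theta_BA) = tan(theta_B)/gamma."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return math.atan2(math.sin(theta_B) / gamma, math.cos(theta_B))

def omega_chain(theta_B, beta, omega=1.0):
    """Chain-rule form: cos^2(theta_BA)/cos^2(theta_B) * (1 + beta*vy) * omega."""
    th_A = theta_BA_from(theta_B, beta)
    vy = beta * (math.cos(theta_B) - 1.0) / (1.0 - beta**2 * math.cos(theta_B))
    return math.cos(th_A)**2 / math.cos(theta_B)**2 * (1.0 + beta * vy) * omega

def omega_closed(theta_BA, beta, omega=1.0):
    """Closed form in theta_BA; minus sign when cos(theta_BA) > 0, plus otherwise."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    s = math.sqrt(1.0 + gamma**2 * math.tan(theta_BA)**2)
    sign = -1.0 if math.cos(theta_BA) > 0 else 1.0
    return s**3 * math.cos(theta_BA)**2 / (gamma**2 * (s + sign * beta**2)) * omega

beta = 0.6
for theta_B in (0.5, 2.0, 4.0):
    print(omega_chain(theta_B, beta), omega_closed(theta_BA_from(theta_B, beta), beta))
```

The two columns printed for each angle should agree, which tests the elimination of $\theta_B$ but not the physical assumptions behind the formulas.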


7.10.6 The Accelerated Observer 


Let $\bar{\Sigma}$ be the accelerated (rotating) observer who rotates with the disk. For
this observer we must define a coordinate transformation, which will relate its
coordinates with those of the RIO $\Sigma$. The definition considered by Grøn (we
emphasize that this is not the sole choice!) consists of the following assumptions:


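23 Another way to compute the angle $\theta_{B,A}$ is to use the transformation of the angles for boosts,
which is given by the following formulae:

$$\sin\theta_B = \frac{\sin\theta_{B,A}}{\sqrt{1 - \beta^2\cos^2\theta_{B,A}}}, \qquad \cos\theta_B = \frac{\cos\theta_{B,A}}{\gamma\sqrt{1 - \beta^2\cos^2\theta_{B,A}}}. \tag{7.151}$$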




a. The disk remains a plane circular disk as it rotates. This means that the rotating
   observer can use the polar coordinates $(\bar{r}, \bar{\theta}, \bar{z})$ where $(\bar{r}, \bar{\theta})$ are the radial and the
   angular coordinate in the rotating plane $(\bar{x}, \bar{y})$, which rotates wrt the plane (x, y).

b. The radius of the rotating disk is the same as the radius of the disk at rest, that
   is R. This implies for the transformation of the radial coordinate:

$$\bar{r} = r.$$

c. The clocks of observer $\bar{\Sigma}$ are synchronized with the clocks of $\Sigma$ with spherical
   optical waves which are emitted from the center of rotation of the disk. This
   implies that the coordinate time in the two frames is related as follows:

$$\bar{t} = t.$$

d. For the accelerated observer the rotation is “rigid” in the sense that points on a
   radius remain on that radius during the rotation. This implies that the angles on
   the disk for the accelerated observer are related to the angles on the disk for the
   inertial observer $\Sigma$ by the formula:

$$\bar{\theta} = \theta - \omega t.$$

e. For the remaining coordinate z normal to the plane of the disk we assume that
   there is no change and write:

$$\bar{z} = z.$$

In conclusion the coordinate system $(\bar{t}, \bar{r}, \bar{\theta}, \bar{z})$ of the rotating observer is related
to the coordinate system $(t, r, \theta, z)$ of the RIO $\Sigma$ by the transformation equations:

$$\bar{t} = t, \quad \bar{r} = r, \quad \bar{\theta} = \theta - \omega t, \quad \bar{z} = z. \tag{7.155}$$


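Having the coordinate transformation we compute the metric for the accelerated
observer by requiring that $d\bar{s}^2 = ds^2$. We compute:

$$d\bar{s}^2 = -dt^2 + dr^2 + r^2d\theta^2 + dz^2 = -d\bar{t}^2 + d\bar{r}^2 + \bar{r}^2(d\bar{\theta} + \omega\,d\bar{t})^2 + d\bar{z}^2 \;\Rightarrow$$
$$d\bar{s}^2 = -(1 - \omega^2\bar{r}^2)\,d\bar{t}^2 + d\bar{r}^2 + \bar{r}^2d\bar{\theta}^2 + d\bar{z}^2 + 2\omega\bar{r}^2\,d\bar{\theta}\,d\bar{t}. \tag{7.156}$$

This gives the following metric components in the rotating coordinate system (asterisks denote entries fixed by symmetry):

$$[g_{ab}] = \begin{pmatrix} -(1 - \omega^2\bar{r}^2) & 0 & \omega\bar{r}^2 & 0 \\ * & 1 & 0 & 0 \\ * & * & \bar{r}^2 & 0 \\ * & * & * & 1 \end{pmatrix}. \tag{7.157}$$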



Using the components of the metric and relations (7.126) and (7.128) one is able to
compare the kinematics of the accelerated observer $\bar{\Sigma}$ with that of the RIO $\Sigma$.


7.10.6.1 Time Intervals 


Let $\tau$ be the proper time of the accelerating observer. Then (7.127) gives:

$$d\tau = \sqrt{|g_{00}|}\,d\bar{t} = \sqrt{1 - \omega^2\bar{r}^2/c^2}\;d\bar{t} \qquad (\omega\bar{r} < c). \tag{7.158}$$

The same result is obtained if we set $d\bar{r} = d\bar{\theta} = 0$ in the expression of the
metric and then use $ds^2 = -c^2d\tau^2$.

Kinematically this means that all proper clocks of the rotating observer go slower
when compared to the clock of the RIO^{24} $\Sigma$ and with a rate increasing with the
position $\bar{r}$ according to (7.158). Thus events at different distances from the center of
the disk, measured as simultaneous on the coordinate clocks, are not simultaneous
for $\bar{\Sigma}$.

It is to be noted that the clocks in $\bar{\Sigma}$ do not agree with the clocks in $\Sigma_A$ ($d\bar{t} \neq
dt_A$). This is due to the synchronization procedure followed between the clocks of
$\bar{\Sigma}$, $\Sigma$ (spherical light waves from the center of rotation at fixed intervals) and that
between the clocks of $\Sigma$, $\Sigma_A$ (Einstein synchronization).


7.10.6.2 Spatial Geometry 


The positive definite line element, which defines the intrinsic spatial geometry of
the accelerating observer is:

$$d\sigma^2 = h_{\mu\nu}dx^\mu dx^\nu = \left|g_{\mu\nu} - g_{0\mu}g_{0\nu}/g_{00}\right| dx^\mu dx^\nu. \tag{7.159}$$

From the components of the metric for the rotating observer we compute
(see (7.157)):

$$d\sigma^2 = d\bar{r}^2 + \frac{\bar{r}^2}{1 - \omega^2\bar{r}^2}\,d\bar{\theta}^2. \tag{7.160}$$

It follows that the line element along the radial coordinate is (set $d\bar{\theta} = 0$):

$$d\sigma_{\bar{r}} = d\bar{r} \tag{7.161}$$


24 This is another view of the twin paradox. The first twin ages with the clock of $\Sigma$ and
the traveling twin with the proper clock of the rotating disk. Furthermore if we have two twins
positioned at different distances from the center of the rotating disk, the twin nearer to the center of
rotation ages quicker than the twin further away.




and the tangential line element along the $\bar{\theta}$-direction is (set $d\bar{r} = 0$):

$$d\sigma_{\bar{\theta}} = \frac{\bar{r}}{\sqrt{1 - \omega^2\bar{r}^2}}\,d\bar{\theta}. \tag{7.162}$$


The fact that the proper spatial length depends on 7 shows that the disk cannot 
pass from rest to rotation in such a way that both the radial and the tangential line 
elements remain unchanged. 

In order to compute the length of the periphery of the disk in $\bar{\Sigma}$ we take in (7.162)
$d\bar{\theta} = 2\pi$ and $\bar{r} = R$ and get:

$$\bar{S} = \frac{2\pi R}{\sqrt{1 - \omega^2R^2}} = \frac{S}{\sqrt{1 - \omega^2R^2}} \tag{7.163}$$

where $S$ is the length of the periphery of the disk for the RIO $\Sigma$. The difference is
due to the fact that the unit of spatial length for the observer $\bar{\Sigma}$ changes along the
periphery of the disk.
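A short numerical illustration of (7.158) and (7.163) follows. It is a sketch only: the function names `clock_rate` and `circumference` and the sample values R = 1, ω = 0.6 (with c = 1) are assumptions of the example.

```python
import math

def clock_rate(r, omega, c=1.0):
    """d(tau)/d(tbar) for a clock at radius r on the disk, as in (7.158)."""
    return math.sqrt(1.0 - (omega * r / c)**2)

def circumference(R, omega, c=1.0):
    """Length of the periphery for the rotating observer, from (7.162) integrated over 2*pi."""
    return 2.0 * math.pi * R / math.sqrt(1.0 - (omega * R / c)**2)

R, omega = 1.0, 0.6
print(clock_rate(R, omega))                        # rim clock runs slow relative to tbar
print(circumference(R, omega) / (2.0 * math.pi * R))  # ratio to the length 2*pi*R in Sigma
```

The two numbers are reciprocal to each other: the factor by which the rim clock slows is the inverse of the factor by which the measured periphery grows.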


7.10.6.3 The Velocity of Light for the Accelerating Observer $\bar{\Sigma}$


From (7.159) we have that the line element along the coordinate direction $l$ is found
by setting $\mu = \nu = l$:

$$d\sigma_l = \sqrt{\left|g_{ll} - g_{0l}g_{0l}/g_{00}\right|}\;dx^l.$$

The velocity of a particle in $\bar{\Sigma}$ along the direction $l$ is (we do not take c = 1 in
order to show the differentiation from the value c):

$$v_l = c\,\frac{d\sigma_l}{dx^0} = c\sqrt{\left|g_{ll} - g_{0l}g_{0l}/g_{00}\right|}\;\frac{dx^l}{dx^0}. \tag{7.164}$$

The line element between two points with coordinates $(x^0, x^l)$ and $(x^0 +
dx^0, x^l + dx^l)$ (which are related with velocity $\frac{dx^l}{dx^0}$) is:

$$ds^2 = g_{ll}(dx^l)^2 + 2g_{l0}dx^ldx^0 + g_{00}(dx^0)^2.$$

If these points are along the trajectory of a light ray, then $ds^2 = 0$ and we obtain:

$$g_{ll}(dx^l)^2 + 2g_{l0}dx^ldx^0 + g_{00}(dx^0)^2 = 0.$$


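Dividing by $(dx^0)^2$ we find:

$$\frac{dx^l}{dx^0} = \frac{|g_{00}|}{\sqrt{\left|g_{l0}^2 - g_{ll}g_{00}\right|} + g_{l0}}.$$

Substituting in (7.164) we obtain the velocity of light in the direction $l$:

$$c_l = c\,\frac{d\sigma_l}{dx^0} = \sqrt{\left|g_{ll} - g_{0l}g_{0l}/g_{00}\right|}\;\frac{|g_{00}|}{\sqrt{\left|g_{l0}^2 - g_{ll}g_{00}\right|} + g_{l0}}\;c = \frac{\sqrt{|g_{00}|}}{1 + \dfrac{g_{l0}}{\sqrt{\left|g_{l0}^2 - g_{ll}g_{00}\right|}}}\;c. \tag{7.165}$$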
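Let us see what we get for the two characteristic coordinate directions. Along the
radial direction $\bar{r}$ we have $g_{0\bar{r}} = 0$, $g_{00} = -(1 - \omega^2\bar{r}^2)$, $g_{\bar{r}\bar{r}} = 1$ therefore:

$$c_{\bar{r}} = \sqrt{1 - \omega^2\bar{r}^2}\;c < c.$$

For the tangential velocity of light we have $g_{0\bar{\theta}} = \omega\bar{r}^2$, $g_{00} = -(1 - \omega^2\bar{r}^2)$,
$g_{\bar{\theta}\bar{\theta}} = \bar{r}^2$. Hence:

$$c_{\bar{\theta}} = \frac{\sqrt{1 - \omega^2\bar{r}^2}}{\dfrac{\omega\bar{r}^2}{\sqrt{\left|\omega^2\bar{r}^4 - \bar{r}^2[-(1 - \omega^2\bar{r}^2)]\right|}} + 1}\;c = \frac{\sqrt{1 - \omega^2\bar{r}^2}}{1 + \omega\bar{r}}\;c = \sqrt{\frac{1 - \omega\bar{r}}{1 + \omega\bar{r}}}\;c < c.$$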
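The two special cases can be checked against the general expression (7.165) with a few lines of Python. This is an illustrative sketch: the function name `light_speed` and the sample values ω = 0.5, r̄ = 1 are assumptions, and the metric components are entered in units with c = 1.

```python
import math

def light_speed(g00, g0l, gll, c=1.0):
    """Velocity of light along the coordinate direction l, following (7.165).

    Signature (-,+,+,+): g00 < 0 is assumed.
    """
    disc = math.sqrt(abs(g0l**2 - gll * g00))
    return math.sqrt(abs(g00)) / (1.0 + g0l / disc) * c

omega, r = 0.5, 1.0
g00 = -(1.0 - omega**2 * r**2)
c_radial = light_speed(g00, 0.0, 1.0)               # g_0r = 0, g_rr = 1
c_tangential = light_speed(g00, omega * r**2, r**2)  # g_0theta = omega r^2, g_thetatheta = r^2
print(c_radial, math.sqrt(1.0 - omega**2 * r**2))
print(c_tangential, math.sqrt((1.0 - omega * r) / (1.0 + omega * r)))
```

Each printed pair should coincide, which confirms that the closed forms for the radial and tangential directions follow from the general formula.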


Exercise 7.10.1 Compute the spatial length of the velocity of light, that is the
quantity $\sqrt{c_{\bar{r}}^2 + c_{\bar{\theta}}^2}$, and compare it with c.


The reason that the velocity of light is locally different from c is that
the clocks of the rotating disk are synchronized with light signals from the
center of the disk, whereas the clocks of the RIO $\Sigma$ are synchronized by the Einstein
convention (i.e. the Lorentzian kinematics).


Exercise 7.10.2 Show that the time $\Delta t_r$ required by a light ray to reach from the
center of the disk to the periphery of a circle of radius r is given by:

$$\Delta t_r = \frac{1}{c}\int_0^r \frac{d\bar{r}}{\sqrt{1 - \omega^2\bar{r}^2/c^2}} = \frac{1}{\omega}\arcsin\frac{\omega r}{c}. \tag{7.166}$$
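A numerical quadrature is not a proof, but it gives a quick sanity check of the closed form in (7.166) before attempting the exercise. The sketch below uses a simple midpoint rule; the function names and the sample values are assumptions of the example.

```python
import math

def travel_time_numeric(r, omega, c=1.0, n=20000):
    """Midpoint-rule approximation of (1/c) * integral_0^r dr' / sqrt(1 - omega^2 r'^2 / c^2)."""
    h = r / n
    total = 0.0
    for k in range(n):
        rk = (k + 0.5) * h
        total += h / math.sqrt(1.0 - (omega * rk / c)**2)
    return total / c

def travel_time_closed(r, omega, c=1.0):
    """Closed form (1/omega) * arcsin(omega r / c)."""
    return math.asin(omega * r / c) / omega

print(travel_time_numeric(1.0, 0.5), travel_time_closed(1.0, 0.5))
```

For ω r/c well below 1 the integrand is smooth and the two values agree to many digits; close to ω r = c the integrand becomes singular and a finer grid would be needed.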




7.11 The Generalization of Lorentz Transformation 
and the Accelerated Observers 


The Lorentz transformation contains all the mathematical structure of Special
Relativity. Indeed, being linear, it relates the characteristic observers of the theory (the
RIO) and, by means of the Lorentz group, defines the Lorentz tensors, which are
the mathematical objects describing the physical quantities of the theory. Therefore
within the framework set up by the Lorentz transformation it is possible to study
all problems involving relativistic inertial motions. However in practice the rule
is accelerated motions whereas inertial motions are the exception. Therefore it is
natural to ask:


Is it possible to study accelerated motions by means of a (generalized) Lorentz transformation?


One answer to this question has been given in Sect. 7.2 by the introduction of the
Locally Relativistic Inertial Observers (LRIO). In that approach one approximates
the worldline of an accelerated observer (which of course is not a straight line)
with a continuous sequence of straight lines, each line being the tangent at a
point of the worldline. This continuous sequence of straight lines is
parameterized by the proper time $\tau$ (say) of the accelerated observer and can be
considered equivalently as a continuous sequence of Lorentz transformations whose
parameter $\beta(\tau)$ is again a function of the proper time $\tau$. This approach has been used
in the study of the Thomas phenomenon.

In the last sections we considered a different approach, that is, we induced
a metric for the accelerated observer using the metric of the inertial observer
$\eta = \mathrm{diag}(-1, 1, 1, 1)$ and the transformation relating the coordinates of the two
observers. Then we were able to calculate temporal and spatial distances for the
accelerated observer. In the present section we wish to take (7.127) and (7.130) one
step further and answer the question:

Is it possible to determine a transformation which relates a RIO with a given 
accelerated observer in the same way the Lorentz transformation relates a RIO with 
a RIO? 

As we shall show this is possible, and the resulting transformation — which is not 
universal as the Lorentz transformation, but depends on the particular accelerated 
observer — we call generalized Lorentz transformation. We shall consider only 
simple configurations, because in more general situations one has to go directly to 
General Relativity. This “generalization” of the Lorentz transformation will take 
us to the limits of Special Relativity and furthermore, indicates the direction one 
should take in order to extend the theory. Indeed, as will be shown, the extension 
— generalization of Special Relativity is not of a simple academic interest but it 
is a necessity emerging from physical reality. That is to say, there are relativistic 
phenomena which cannot be answered within the scenario of Special Relativity, 
because they are not due to relative motion. 


7.11.1 The Generalized Lorentz Transformation 


We seek a generalization of Lorentz transformation which will meet the following 
demands: 


1. The assumptions which will be made shall be minimal 
2. In the limit, when the acceleration vanishes the “generalized” Lorentz transfor- 
mation will reduce to the standard Lorentz transformation. 


We recall that the standard Lorentz transformation has the following basic 
features: 


1. It is an isometry, that is, it preserves the Lorentz length: $ds^2 = ds'^2$.
2. It is linear.
3. It preserves the canonical form of the Lorentz metric diag(−1, 1, 1, 1).


The generalization we are looking for must defy some of these assumptions. But 
which one(s)? 

The assumption of isometry is essential because it concerns fundamental physical
principles of the theory. For example the particles are classified in photons and
particles with mass according to the (Lorentz) length of the momentum or position
four-vector. Therefore a transformation which does not preserve the length of
four-vectors could change the nature of a particle. This would imply that,
by means of a transformation, a photon could become a particle with mass and
conversely, a situation which cannot be accepted.

There remain the other two assumptions. The property of linearity must be defied, 
because the worldline of an accelerated observer is not a straight line and a linear 
transformation cannot relate a straight line with a non-straight line. Concerning 
the last property this must also be abandoned because the Lorentz metric for an 
accelerated observer cannot have the canonical form. 

Based on the above remarks we demand that the generalized Lorentz transforma- 
tion must satisfy the following requirements: 


• Must be an isometry, that is $ds^2 = ds'^2$

• Will not be linear except when the acceleration vanishes

• Will apply in general only locally, that is around every point of the worldline
  of the (specific) accelerated observer and not globally (i.e. in the whole of
  Minkowski space) as is the case with the standard Lorentz transformation.
  Furthermore in general the set of all “generalized” Lorentz transformations
  associated with an accelerated motion cannot be a subgroup of the Lorentz group
  and need not form a group.


Consider a RIO $\Sigma$ with coordinates $(l, x, y, z)$ and let $(l', x', y', z')$ be the
coordinates of an accelerated observer $\Sigma'$. The distance of two “adjacent” points




in Minkowski space for $\Sigma$ is^{25}:

$$ds^2 = -dl^2 + dx^2 + dy^2 + dz^2 \tag{7.167}$$

while for $\Sigma'$ let us assume that it has the general form^{26}:

$$ds'^2 = -u_0^2\,dl'^2 + u_1^2\,dx'^2 + u_2^2\,dy'^2 + u_3^2\,dz'^2 \tag{7.168}$$

where $u_0, u_1, u_2, u_3$ are real, smooth functions of the coordinates $(l', x', y', z')$. The
demand of isometry gives:

$$-dl^2 + dx^2 + dy^2 + dz^2 = -u_0^2\,dl'^2 + u_1^2\,dx'^2 + u_2^2\,dy'^2 + u_3^2\,dz'^2. \tag{7.169}$$


In order to simplify the mathematics we consider motion along the x-direction
only and set $y = y'$, $z = z'$. Then the transformation we are looking for is of the
form:

$$x = x(l', x'), \qquad l = l(l', x') \tag{7.170}$$

where the functions are such that the transformation will be non-singular, that is, it
will be reversible. The condition for this is that the Jacobian shall be non-zero. We
compute:

$$dx = \frac{\partial x}{\partial l'}dl' + \frac{\partial x}{\partial x'}dx' = a(l', x')\,dl' + b(l', x')\,dx'$$
$$dl = \frac{\partial l}{\partial l'}dl' + \frac{\partial l}{\partial x'}dx' = c(l', x')\,dl' + d(l', x')\,dx'$$

from which follows:

$$ds^2 = -dl^2 + dx^2 = -(c\,dl' + d\,dx')^2 + (a\,dl' + b\,dx')^2 = (-c^2 + a^2)\,dl'^2 + (b^2 - d^2)\,dx'^2 + 2(ab - cd)\,dl'dx'.$$

But:

$$ds'^2 = -u_0^2\,dl'^2 + u_1^2\,dx'^2 \tag{7.171}$$


25 The concept “adjacent” is outside the scope of this book and requires more advanced mathematics
(topology). We simply mention that “adjacent” is modulated according to the curvature (=
measure of acceleration) at each point of the worldline of the accelerated observer.

26 This form is general, because it can be shown that every (non-singular) metric can be put in this
form by an appropriate choice of the coordinate system. Note that the quantities $u_0, u_1, u_2, u_3$ are
functions of the coordinates and not constants.




hence isometry implies: 


e-@=ue 172) 
BP —d =u? (7.173) 
ab —cd =0. (7.174) 


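Because the functions $u_0, u_1$ are real we demand $|c| > |a|$, $|b| > |d|$. From
the continuity of the transformation we also have the conditions:

$$\frac{\partial^2 x}{\partial l'\,\partial x'} = \frac{\partial^2 x}{\partial x'\,\partial l'} \Rightarrow a_{,x'} = b_{,l'} \tag{7.175}$$

$$\frac{\partial^2 l}{\partial l'\,\partial x'} = \frac{\partial^2 l}{\partial x'\,\partial l'} \Rightarrow c_{,x'} = d_{,l'} \tag{7.176}$$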


where “,” indicates partial derivative wrt the coordinate(s) that follows. Relations
(7.172), (7.173), (7.174), (7.175), and (7.176) are the relations among the
coefficients $a, b, c, d$ of the required transformation and the components $u_0, u_1$ of the
metric.

We note that if we demand the transformation to be linear the coefficients
$a, b, c, d$ must be constants. In this case from (7.172), (7.173), and (7.174) it follows
that the coefficients $u_0, u_1$ are also constants, hence if we consider the change of
units $x' \to u_1 x'$, $l' \to u_0 l'$ the metric takes its canonical form diag(−1, 1, 1, 1)
and we return to the standard Lorentz transformation. We infer that in order to have
a generalized Lorentz transformation the coefficients $a, b, c, d$ or, equivalently, the
functions $u_0(l', x')$, $u_1(l', x')$ must be non-constant.

Obviously the general solution of the system of equations (7.172), (7.173),
(7.174), (7.175), and (7.176) is difficult to find (and perhaps it is of no interest);
therefore we look for special solutions which have a profound kinematic
and/or geometric significance.


7.11.2 The Special Case $u_0(l', x') = u_1(l', x') = u(x')$


We consider the accelerated (one dimensional) motion which is defined by the
conditions:

$$u_0(l', x') = u_1(l', x') = u(x') \tag{7.177}$$

where $u(x')$ is a real smooth function. We note that for this choice the metric for
the accelerated observer is conformal$^{27}$ to the canonical form of the Lorentz metric
diag(−1, 1, 1, 1):

$$ds'^2 = u^2(x')\,(-dl'^2 + dx'^2).$$


In this particular case relations (7.172), (7.173), (7.174), (7.175), and (7.176) read:

$$c^2 - a^2 = u^2 \tag{7.178}$$
$$b^2 - d^2 = u^2 \tag{7.179}$$
$$ab - cd = 0 \tag{7.180}$$
$$a_{,x'} = b_{,l'} \tag{7.181}$$
$$c_{,x'} = d_{,l'} \tag{7.182}$$

with the restriction $|c| > |a|$, $|b| > |d|$. From (7.178) and (7.179) follows:

$$c^2 - a^2 = b^2 - d^2 \Rightarrow a^2 + b^2 = c^2 + d^2$$

and from (7.180):

$$ab = cd.$$

Adding and subtracting (twice the second from the first) we find:

$$|a + b| = |c + d|, \quad |a - b| = |c - d|.$$


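These relations are equivalent to the following four systems of algebraic equations:

$a + b = c + d$ and $a - b = c - d$ $\Rightarrow$ $a = c$, $b = d$ (not accepted)
$a + b = c + d$ and $a - b = -c + d$ $\Rightarrow$ $a = d$, $b = c$ (accepted)
$a + b = -c - d$ and $a - b = c - d$ $\Rightarrow$ $a = -d$, $b = -c$ (accepted)
$a + b = -c - d$ and $a - b = -c + d$ $\Rightarrow$ $a = -c$, $b = -d$ (not accepted).

The first and the fourth systems are rejected because they violate the restriction $|c| > |a|$.
We conclude that the solution is:

$$a = \epsilon_1 d, \quad b = \epsilon_1 c \tag{7.183}$$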


$^{27}$ Two metrics $g_1(x^i)$, $g_2(x^i)$ are said to be conformal if there exists a smooth function $\phi(x^i)$
such that $g_2(x^i) = \phi(x^i)\,g_1(x^i)$. Conformally related metrics have important applications in
Physics and especially in electromagnetism and General Relativity. A metric which is conformal
to the Lorentz metric is called conformally flat.




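where $\epsilon_1 = \pm 1$.
We differentiate (7.179) wrt $l'$ and find (note that $u = u(x')$!):

$$b\,b_{,l'} - d\,d_{,l'} = 0.$$

We continue with relations (7.181) and (7.182). From (7.181), (7.182) and (7.183) we have:

$$c\,a_{,x'} - a\,c_{,x'} = c\,b_{,l'} - a\,d_{,l'} = \epsilon_1\,(b\,b_{,l'} - d\,d_{,l'}) = 0$$

from which follows:

$$\left(\frac{a}{c}\right)_{,x'} = 0 \Rightarrow a = \Phi(l')\,c \tag{7.184}$$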


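where $\Phi(l')$ is a smooth function. Replacing $a$ in (7.178) we find ($\epsilon_2 = \pm 1$):

$$c = \frac{\epsilon_2 u}{\sqrt{1 - \Phi^2}}, \quad |\Phi| < 1$$

and from (7.183) follows:

$$d = \epsilon_1 a = \epsilon_1 \Phi\, c.$$

Finally the solution is:

$$a = \epsilon_1 d = \frac{\epsilon_2 u \Phi}{\sqrt{1 - \Phi^2}}, \quad b = \epsilon_1 c = \frac{\epsilon_1 \epsilon_2 u}{\sqrt{1 - \Phi^2}}. \tag{7.185}$$

This solution is expressed in terms of the functions $u(x')$, $\Phi(l')$ ($|\Phi| < 1$),
which have to be determined. In order to do that we consider (7.181) from which
follows:

$$a_{,x'} = b_{,l'} \Rightarrow \frac{\epsilon_2 u_{,x'} \Phi}{\sqrt{1 - \Phi^2}} = \frac{\epsilon_1 \epsilon_2 u\, \Phi\, \Phi_{,l'}}{(1 - \Phi^2)^{3/2}} \Rightarrow \frac{u_{,x'}}{u} = \frac{\epsilon_1 \Phi_{,l'}}{1 - \Phi^2}.$$

The lhs is a function of $x'$ and the rhs is a function of $l'$. Therefore each side of
the equation must equal a constant $p$, say:

$$\frac{u_{,x'}}{u} = p \tag{7.186}$$

$$\frac{\epsilon_1 \Phi_{,l'}}{1 - \Phi^2} = p. \tag{7.187}$$

Integration of the first gives:

$$u(x') = A e^{px'}, \quad A = \text{constant} \tag{7.188}$$

and integration of the second:

$$\frac{1 + \Phi}{1 - \Phi} = e^{2\epsilon_1 p l' + 2B} \Rightarrow \Phi(l') = \tanh(\epsilon_1 p l' + B), \quad B = \text{constant}. \tag{7.189}$$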


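Eventually we have the solution:

$$a(l', x') = \epsilon_1 d(l', x') = \epsilon_2 A e^{px'} \frac{\tanh(\epsilon_1 p l' + B)}{\sqrt{1 - \tanh^2(\epsilon_1 p l' + B)}} = \epsilon_2 A e^{px'} \sinh(\epsilon_1 p l' + B)$$

$$b(l', x') = \epsilon_1 c(l', x') = \epsilon_1 \epsilon_2 A e^{px'} \cosh(\epsilon_1 p l' + B).$$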


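In order to compute the constant $B$ we assume the initial condition: The motion
starts from rest. Then $\frac{dx}{dl}(0) = \frac{dx'}{dl'}(0) = 0$ and follows:

$$\left.\frac{dx}{dl}\right|_0 = \left.\frac{a\,dl' + b\,dx'}{c\,dl' + d\,dx'}\right|_0 = \frac{a(0) + b(0)\left.\frac{dx'}{dl'}\right|_0}{c(0) + d(0)\left.\frac{dx'}{dl'}\right|_0} = \frac{a(0)}{c(0)} \Rightarrow a(0) = 0 \Rightarrow B = 0.$$

This implies the final expression for the coefficients of the transformation:

$$a(l', x') = \epsilon_1 d(l', x') = \epsilon_2 A e^{px'} \sinh(\epsilon_1 p l') \tag{7.190}$$
$$b(l', x') = \epsilon_1 c(l', x') = \epsilon_1 \epsilon_2 A e^{px'} \cosh(\epsilon_1 p l'). \tag{7.191}$$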
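
As a sanity check, the coefficients (7.190)-(7.191) can be tested numerically against the five conditions (7.178)-(7.182). The following is only a sketch: the values of $p$ and $A$ and the function names are illustrative assumptions, and the partial derivatives are approximated by central differences rather than computed analytically.

```python
import math

def coefficients(lp, xp, p=0.7, A=1.0, eps1=1, eps2=1):
    """Return (a, b, c, d) of (7.190)-(7.191) at the point (l', x').

    p, A and the sign factors eps1, eps2 are free parameters of the
    solution; the default values are arbitrary illustrative choices.
    """
    u = A * math.exp(p * xp)
    a = eps2 * u * math.sinh(eps1 * p * lp)
    b = eps1 * eps2 * u * math.cosh(eps1 * p * lp)
    c = eps1 * b          # from b = eps1 * c and eps1**2 = 1
    d = eps1 * a          # from a = eps1 * d and eps1**2 = 1
    return a, b, c, d

def partial(index, lp, xp, wrt, h=1e-6, **kw):
    """Central-difference derivative of coefficient `index` wrt l' or x'."""
    if wrt == "l":
        hi = coefficients(lp + h, xp, **kw)[index]
        lo = coefficients(lp - h, xp, **kw)[index]
    else:
        hi = coefficients(lp, xp + h, **kw)[index]
        lo = coefficients(lp, xp - h, **kw)[index]
    return (hi - lo) / (2 * h)

def residuals(lp, xp, p=0.7, A=1.0, eps1=1, eps2=1):
    """Left minus right side of (7.178)-(7.182); all should be ~0."""
    kw = dict(p=p, A=A, eps1=eps1, eps2=eps2)
    a, b, c, d = coefficients(lp, xp, **kw)
    u2 = (A * math.exp(p * xp)) ** 2
    return (
        c * c - a * a - u2,                                        # (7.178)
        b * b - d * d - u2,                                        # (7.179)
        a * b - c * d,                                             # (7.180)
        partial(0, lp, xp, "x", **kw) - partial(1, lp, xp, "l", **kw),  # (7.181)
        partial(2, lp, xp, "x", **kw) - partial(3, lp, xp, "l", **kw),  # (7.182)
    )

if __name__ == "__main__":
    for e1 in (1, -1):
        for e2 in (1, -1):
            print(e1, e2, residuals(0.3, -0.2, eps1=e1, eps2=e2))
```

All four sign combinations should give residuals that vanish up to the finite-difference error, which illustrates that the signs $\epsilon_1, \epsilon_2$ are not fixed by the isometry conditions themselves.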


Having the expression of the coefficients we are in a position
to determine the “generalized” Lorentz transformation for the particular case we
discuss (not for every accelerated motion!). We find:

$$x = \int (a\,dl' + b\,dx') = \epsilon_1 \epsilon_2 \int d\left[\frac{A}{p} e^{px'} \cosh(\epsilon_1 p l')\right] = \epsilon_1 \epsilon_2 \frac{A}{p} e^{px'} \cosh(\epsilon_1 p l') + B_1 \tag{7.192}$$

and similarly:

$$l = \epsilon_1 \epsilon_2 \frac{A}{p} e^{px'} \sinh(\epsilon_1 p l') + B_2. \tag{7.193}$$


We determine the constants $B_1, B_2$ by demanding the initial conditions: $l = l' = 0$,
$x' = 0$, $x = k$, that is, the motion starts from the point $x = k$ of $\Sigma$ and with
the clocks synchronized at the origin of the coordinates. We compute easily
$B_1 = -\epsilon_1 \epsilon_2 \frac{A}{p} + k$, $B_2 = 0$, hence:

$$l = \epsilon_1 \epsilon_2 \frac{A}{p} e^{px'} \sinh(\epsilon_1 p l') \tag{7.194}$$

$$x = \epsilon_1 \epsilon_2 \frac{A}{p} \left(e^{px'} \cosh(\epsilon_1 p l') - 1\right) + k. \tag{7.195}$$


It remains to compute the constant $A$ which concerns initial conditions for $u$. We
require that when $x' = 0$ then $dx' = 0$, that is, at the origin of $E$ the coordinate
$l' = \tau$ where $\tau$ is the proper time of the accelerated observer. This demand restricts
the motion of the origin of the accelerated frame to be the hyperbolic motion we
studied in Sect. 7.4.

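For $dx' = 0$ we find ($c = 1$):

$$ds'^2 = -u^2(0)\,dl'^2.$$

But we also have that $ds^2 = -d\tau^2$ so that from the isometry condition $ds^2 = ds'^2$
(with $l' = \tau$) follows:

$$-d\tau^2 = -u^2(0)\,d\tau^2 \Rightarrow u^2(0) = 1 \Rightarrow A = \epsilon_3 \tag{7.196}$$

where $\epsilon_3 = \pm 1$. Using $\sinh(\epsilon_1 p l') = \epsilon_1 \sinh p l'$ we conclude that the generalized
Lorentz transformation for this type of accelerated motion is:

$$l = \epsilon_2 \epsilon_3 \frac{1}{p} e^{px'} \sinh p l', \quad x = \epsilon_1 \epsilon_2 \epsilon_3 \frac{1}{p} \left(e^{px'} \cosh p l' - 1\right) + k. \tag{7.197}$$

Indeed we note that when $x' = 0$ relations (7.197) read:

$$l = \epsilon_2 \epsilon_3 \frac{1}{p} \sinh p l', \quad x = \epsilon_1 \epsilon_2 \epsilon_3 \frac{1}{p} \left(\cosh p l' - 1\right) + k \tag{7.198}$$

which are identical with the corresponding relations of hyperbolic motion if we set
$l' = \tau$, $\epsilon_1 = 1$ and $\epsilon_2 \epsilon_3 = 1$ (so that $\epsilon_1 \epsilon_2 \epsilon_3 = 1$).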

It is interesting to compute the inverse transformation $x' = x'(x, l)$, $l' = l'(x, l)$.
This is achieved as follows. We define the constant $\rho = \epsilon_1 \epsilon_2 \epsilon_3\, p$ and use (7.197) to
find:

$$\frac{l\rho}{(x - k)\rho + 1} = \epsilon_1 \tanh p l' \Rightarrow l' = \frac{\epsilon_1}{p} \tanh^{-1}\left[\frac{l\rho}{(x - k)\rho + 1}\right]. \tag{7.199}$$

Replacing in the first of (7.197) we get:

$$e^{px'} = \frac{\epsilon_1 l \rho}{\sinh p l'} \Rightarrow x' = \frac{1}{p} \ln\left\{\frac{l\rho}{\sinh \tanh^{-1}\left[\dfrac{l\rho}{(x - k)\rho + 1}\right]}\right\}. \tag{7.200}$$


Let us examine the equations of transformation in the limit $px' \to 0$, that is, very
close to the origin of the accelerated frame. In this case
$e^{px'} = \sum_{n=0}^{\infty} \frac{(px')^n}{n!} \approx 1 + px' \approx 1$
and the equations of the transformation (7.197) are simplified as follows (taking the sign factors equal to $+1$):

$$l = \frac{1}{p} \sinh p l' \tag{7.201}$$

$$x = \frac{1}{p} \left(\cosh p l' - 1\right) + k. \tag{7.202}$$


We infer that very close to the origin of the accelerated frame the equations 
of transformation tend to the equations of hyperbolic motion, a result which was 
expected. 
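
The transformation and its inverse can also be checked numerically. The sketch below assumes all sign factors equal to $+1$ and uses illustrative values for $p$ and $k$; the function names are hypothetical. For the inverse it uses $e^{2px'} = [(x-k)p+1]^2 - (lp)^2$, which is algebraically equivalent to (7.200) but avoids dividing by $\sinh pl'$ at $l = 0$. It also estimates the induced metric components by central differences, which should reproduce the conformal form $e^{2px'}(-dl'^2 + dx'^2)$ with $A = 1$.

```python
import math

P = 0.5   # assumed value of the constant p (illustrative)
K = 0.0   # assumed initial position k (illustrative)

def to_inertial(lp, xp, p=P, k=K):
    """(l', x') -> (l, x): relations (7.197) with all sign factors +1."""
    scale = math.exp(p * xp)
    l = scale * math.sinh(p * lp) / p
    x = (scale * math.cosh(p * lp) - 1.0) / p + k
    return l, x

def to_accelerated(l, x, p=P, k=K):
    """(l, x) -> (l', x'): inverse transformation, sign factors +1."""
    big_x = (x - k) * p + 1.0   # equals e^{px'} cosh(p l')
    big_t = l * p               # equals e^{px'} sinh(p l')
    lp = math.atanh(big_t / big_x) / p
    xp = 0.5 * math.log(big_x ** 2 - big_t ** 2) / p
    return lp, xp

def induced_metric(lp, xp, h=1e-6):
    """Components (g_l'l', g_x'x', g_l'x') of -dl^2 + dx^2 in (l', x')."""
    l_hi, x_hi = to_inertial(lp + h, xp)
    l_lo, x_lo = to_inertial(lp - h, xp)
    c, a = (l_hi - l_lo) / (2 * h), (x_hi - x_lo) / (2 * h)
    l_hi, x_hi = to_inertial(lp, xp + h)
    l_lo, x_lo = to_inertial(lp, xp - h)
    d, b = (l_hi - l_lo) / (2 * h), (x_hi - x_lo) / (2 * h)
    return (-c * c + a * a, b * b - d * d, a * b - c * d)

if __name__ == "__main__":
    l, x = to_inertial(0.4, 0.3)
    print("round trip:", to_accelerated(l, x))
    print("metric    :", induced_metric(0.4, 0.3))
    print("expected  :", (-math.exp(2 * P * 0.3), math.exp(2 * P * 0.3), 0.0))
```

The round trip should return the starting point $(0.4, 0.3)$, and the mixed component of the induced metric should vanish up to the finite-difference error.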


7.11.2.1 Conclusion 


It is possible to find a non-linear isometry which relates the coordinates $(l, x)$ of a
RIO with the coordinates $(l', x')$ of an accelerated observer $E$. Therefore we arrive
at the important conclusion that:


The accelerated motion can be “absorbed” into the generalization of the metric, that is the 
Geometry. 


Although we proved this for the simple case we considered above, there is no
reason why we should not assume the validity of this conclusion in general. This
conclusion is of fundamental importance for the following reason. According to
the Principle of Equivalence the massive bodies (and, as we shall see, the photons)
suffer the same acceleration in a gravitational field independently of their internal
structure. Consider a RIO $\Sigma$ in which there exists a gravitational field whose
strength is $g$. Then according to the Equivalence Principle all particles with mass
(and the photons), when moving freely in $\Sigma$, attain acceleration $g$. According to the
generalization we have considered it is possible to alter the metric diag(−1, 1, 1, 1)
to a new metric, which will be determined from the form of the gravitational field.
Therefore:


The gravitational field is “absorbed” into Geometry (i.e. the metric) of spacetime (not 
Minkowski space anymore) and in the new metric the effect of the gravity does not exist. 


This type of motion is called free fall. It exists in Newtonian Physics. However it
is not attributed to the change of the background geometry of (absolute) space and
one simply ignores the gravitational field as a force in Newton’s Second Law. The
geometrization of the gravitational field started with Special Relativity and finally
found its natural place in General Relativity, which is also a geometric theory of
Physics.


7.11.3 Equation of Motion in a Gravitational Field 


An important application of the generalized Lorentz transformation, which takes
us very close to General Relativity, is the Principle of Least Action for accelerated
motions.

Let A, B be two events along the worldline of a ReMaP P. If P moves inertially
then the worldline of P is a straight line in Minkowski space and it is defined by the
requirement that the integral:

$$\int_A^B \sqrt{|ds^2|}$$

is extremum wrt the metric $\eta$ = diag(−1, 1, 1, 1). This requirement is equivalent
to the demand that the variation of the action integral vanishes (Principle of Least
Action):

$$\delta \int_A^B \sqrt{|ds^2|} = 0.$$

Because for an accelerated motion we have assumed that the metric remains the
same, that is $ds^2 = ds'^2$, whereas the components change, we stipulate that the
worldline of an accelerated relativistic mass point is given by the condition:

$$\delta \int_A^B \sqrt{|ds'^2|} = 0 \tag{7.203}$$
A 


that is, it is a geodesic (curve of extremum length defined by the points A, B) wrt
the metric $g_{ij}$. Therefore the straight lines of an inertial motion become geodesics of
a proper Riemannian metric of Lorentzian character. In case the acceleration is due
to gravitational forces only, equation (7.203) defines the worldline of a relativistic
particle in free fall in the gravitational field.

We conclude that the effect of accelerated motion in Special Relativity is
twofold:

• It “changes” the Lorentz metric from its canonical form $\eta$ = diag(−1, 1, 1, 1) to
another general form $g_{ij}$. The theory does not say how this new metric is to be
defined and one is free to do it as one considers best. General Relativity gives
field equations for the determination of this metric.

• The worldline of a relativistic particle in free motion is a (timelike) straight line
in Minkowski space when the gravitational field vanishes and a (timelike) geodesic
of the Riemannian space with metric $g_{ij}$ when there exists a gravitational field.

The above conclusions lead us to the following generalization of the Equivalence
Principle of Newtonian gravity in General Relativity (as we shall see Special
Relativity cannot accommodate the gravitational field):

Accelerated motions which are caused by the gravitational field only (free fall) take place
along geodesics of the metric which corresponds to the particular gravitational field.$^{28}$


7.12 The Limits of Special Relativity 


In Sect. 7.11 we showed that accelerated motions, and consequently the gravitational
field, can be absorbed in the geometry of spacetime without introducing new
principles other than the Principle of Equivalence, which is borrowed from Newtonian
theory. Therefore it is logical to expect that with this generalization
Special Relativity can become a theory of gravity. The first attempts in this
direction were made by Einstein himself (and others followed). However it became
clear that Special Relativity cannot accommodate the gravitational field, although
it offers many of the elements needed for such a theory to be constructed. This road led
eventually to the Theory of General Relativity. In the following we present some
thought experiments which justify these remarks. The common characteristic of
these “experiments” is that they consider photons interacting with the gravitational
field and show that this interaction leads to new phenomena which cannot be
explained within the framework of Special Relativity and require a new relativistic
theory of gravitation.


7.12.1 Experiment 1: The Gravitational Redshift 


Consider a positron and an electron which rest within a gravitational field at the
potential level $V$ in some RIO $\Sigma$. At some moment the particles are left to fall freely
in the gravitational field. At the potential level $V + \Delta V$ ($\Delta V < 0$) the two particles
annihilate producing two photons of frequency $\nu$ (why must the two photons have
the same frequency?). Subsequently the photons are reflected elastically on a large
mirror and return to the potential level $V$ where they interact creating again a
positron-electron pair, which is left to fall again, and so on. This thought experiment
was suggested by Einstein.$^{29}$

$^{28}$ This metric is computed (partially) from Einstein field equations.

We discuss this thought experiment with the sole restriction that it does not give
rise to a perpetuum mobile, that is, the Second Principle of Thermodynamics is not
violated. Because the particles have equal masses, at the potential level $V + \Delta V$ they
will have equal kinetic energies $\frac{1}{2} m v^2 = m\,|\Delta V|$. At the first event of annihilation
of the particles conservation of energy implies for the frequency of the photons (in
the following we assume $c = 1$):

$$2h\nu = -2m\Delta V + 2m. \tag{7.204}$$


Let $\bar{\nu}$ be the frequency of the photons when they reach the potential level $V$. If the
photons do not interact with the gravitational field their frequency at the potential
level $V$ will again be $\nu$, hence the produced pair of particles must have non-zero
kinetic energy, which contradicts the Second Principle of Thermodynamics.
Therefore the photons must interact with the gravitational field and more specifically they
must lose energy (i.e. they must be redshifted) as they propagate to higher potential
levels. In order to compute the amount of redshift we assume that the electron-positron
pair at the potential level $V$ is produced at rest, therefore:

$$2h\bar{\nu} = 2m. \tag{7.205}$$

From relations (7.204) and (7.205) follows:

$$\frac{\bar{\nu} - \nu}{\bar{\nu}} = \Delta V < 0. \tag{7.206}$$

The redshift of an electromagnetic wave is measured with the quantity $z = \frac{\Delta\lambda}{\lambda}$
or, in terms of the frequency (where the shift has the opposite sign), with $z = \frac{\Delta\nu}{\bar{\nu}}$, $\Delta\nu = \bar{\nu} - \nu$. With the latter convention it follows that:

$$z = \Delta V. \tag{7.207}$$


The phenomenon of variation of the frequency of a photon as it propagates
in a non-homogeneous gravitational field we call gravitational redshift. This
phenomenon is not Newtonian, nor can it be explained within the framework
of Special Relativity. Indeed the gravitational redshift has served as one of the first
experimental facts towards the justification of General Relativity. It has been verified

$^{29}$ See A. Einstein “Über den Einfluss der Schwerkraft auf die Ausbreitung des Lichtes” (1911)
Ann Phys, 898-908. A translation of this paper can be found in Lorentz H A, Einstein A, Minkowski
H and Weyl H (1923) “The Principle of Relativity: A Collection of Original Memoirs” Methuen,
London. Paperback reprint Dover (1952) New York.


by astronomical observations (with the shift of the line $D_1$ of the spectrum of Na in
the solar spectrum$^{30}$) as well as with relevant observations in the laboratory.

Concerning the latter, the best available measurements have been done by Pound
and Rebka$^{31}$ and Pound and Snider.$^{32}$ In these experiments $\gamma$ radiation of 14.4 keV
produced from the radioactive source $^{57}$Co was made to propagate opposite to the
gravitational field in a tube of length 22.5 m filled with helium, which was
placed along a tower on the campus of Harvard University. At
the point of reception of the photons an absorber enriched with $^{57}$Fe was placed, which
was connected to a proportional counter. If the photons do not interact with the
gravitational field the energy of the photons is proportional to the square of the speed
$u$ of the emitter (this velocity can change as we shall see below in this book when
we study the relativistic reactions). If we consider various speeds of the emitter then
the distribution of the number of photons in terms of speed must be a Gaussian
symmetric around the value $u = 0$.

In the experiment, the distribution of photons for two ranges of the velocity was
measured and the existence of asymmetry was examined. The first range was the
velocities Vp + u and the second range the velocities −Vp + u. The asymmetry
was observed and the calculations agreed with the value given by the gravitational
redshift (that is $z = -gh$ for a height difference $h$, in units with $c = 1$). Therefore the
gravitational redshift is indeed a physical phenomenon, which every theory of gravity
must account for. One of these theories is General Relativity.
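
To get a feeling for the size of the effect, the following sketch evaluates the magnitude of the fractional frequency shift $|z| = gh/c^2$ (that is, $|\Delta V|$ with $c$ restored) for the 22.5 m path and the 14.4 keV line quoted above. The value of $g$ (standard gravity) and the function name are assumptions made here for illustration; they are not taken from the experiments themselves.

```python
G_STANDARD = 9.80665          # m/s^2, standard gravity (assumed value)
C_LIGHT = 299_792_458.0       # m/s, speed of light (exact by definition)

def fractional_shift(height_m, g=G_STANDARD, c=C_LIGHT):
    """Magnitude of the gravitational redshift |z| = g h / c^2
    for a weak, practically homogeneous field."""
    return g * height_m / c ** 2

if __name__ == "__main__":
    z = fractional_shift(22.5)                 # tube length quoted in the text
    print(f"|z|      = {z:.3e}")               # about 2.5e-15
    print(f"delta E  = {14.4e3 * z:.3e} eV")   # shift of the 14.4 keV line
```

The shift is of the order of $10^{-15}$, which indicates why an extremely sharp spectral line was needed in order to detect it over a laboratory-scale height.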


7.12.2. Experiment 2: The Gravitational Time Dilation 


The fundamental conclusion of the gravitational redshift is that the photons interact
with the gravitational field. If one considers the photon as an oscillator, hence as
a clock, this implies that identical proper clocks placed at rest at different places
within a non-homogeneous gravitational field have different rates! This is not
conceivable within the framework of Special Relativity. To show the validity of
this assertion we consider a second thought experiment proposed by A. Schild.$^{33}$


$^{30}$ See Brault J W (1963) “Gravitational Redshift of Solar Lines” Bull Amer Phys Soc 8, 28.

$^{31}$ See Pound R V and Rebka G A (1960) “Apparent weight of photons” Phys Rev Lett 4, 337-341.

$^{32}$ See Pound R V and Snider J L (1965) “Effect of gravity on gamma radiation” Phys Rev B140, 788-803.

$^{33}$
1. Schild A. in “Relativity Theory and Astrophysics I: Relativity and Cosmology” (1967) American
Mathematical Society.

2. Schild A. in “Proceedings International School of Physics Enrico Fermi” (1963) Academic
Press NY, 69-115.




A source of monochromatic electromagnetic radiation is placed at the potential
level $V + \Delta V$ of a gravitational field and a receiver, not moving wrt the source, at
the potential level $V$. If the electromagnetic wave (photon) does not interact with the
gravitational field the frequency of the wave at the emitter and the receiver must be
the same. However as we have shown the electromagnetic field does interact with
the gravitational field, therefore the two frequencies must be different, i.e. $\nu \neq \bar{\nu}$.
Suppose that the frequency of the wave at the emitter is $\nu$ and the frequency at
the receiver is $\bar{\nu}$. Let $\tau$ be the proper time at the emitter and $\bar{\tau}$ the proper time at
the receiver. Assume that the emitter sends photons for a period $\tau$ and assume that
these photons are received at the receiver during a period $\bar{\tau}$. Because the number of
photons (oscillations) must be the same at the emitter and the receiver, we must have
$\nu\tau = \bar{\nu}\bar{\tau}$. But $\nu \neq \bar{\nu}$, therefore $\tau \neq \bar{\tau}$. However according to Special Relativity
the emitter and the receiver do not move relative to each other, nor within the
gravitational field, therefore the indications of the proper clocks at the emitter and
the receiver, once identical (synchronized with the Einstein synchronization), should
stay so, i.e. $\tau = \bar{\tau}$. We conclude that:


1. Special Relativity cannot be used in the study of gravitational phenomena except 
in the cases of very small regions and for weak gravitational fields, which can 
be treated practically as homogeneous, where the gravitational redshift can be 
neglected. 

2. The rate of a clock depends on the strength of the gravitational field at the
point where the clock is situated. This implies that the proper time of relativistic
observers which rest at different potential levels in a gravitational field is not
the same. More specifically the rate of the clock of the observer at the lower
potential level is higher than the rate of the clock at the higher potential level.
This results in a time dilation effect between observers at different potential
levels. This new phenomenon we call gravitational time dilation. In order to
compute the amount of the gravitational time dilation we consider the relation
$\nu\tau = \bar{\nu}\bar{\tau}$ and obtain:

$$\frac{\bar{\nu} - \nu}{\bar{\nu}} = \frac{\tau - \bar{\tau}}{\tau}. \tag{7.208}$$

Replacing the lhs from (7.206), which gives the gravitational redshift, we find:

$$\frac{\tau - \bar{\tau}}{\tau} = \Delta V. \tag{7.209}$$


From (7.209) it is apparent that the rate of a clock depends on the strength of the 
gravitational field at the position of the clock. 


7.12.3 Experiment 3: The Curvature of Spacetime 


The phenomenon of the gravitational time dilation prohibits the use of Special
Relativity but also indicates the course one should take to “extend” this
theory. This direction is the one we have already chosen in the generalization of
the Lorentz transformation. Indeed the generalization of the Lorentz transformation
through the coordinates of a RIO and an accelerating observer leads directly to
the generalization of the Newtonian Principle of Equivalence to incorporate all
physical systems (including photons, which do not exist in Newtonian theory, in the
sense that they are not Newtonian physical systems). The following thought
experiment has been proposed$^{34}$ by A. Schild.

Consider a spherical mass $M$ which creates in the surrounding space a gravitational
field and study the propagation of photons between an emitter on the surface
of the sphere and a receiver (of negligible mass so that it has no effect on the
gravitational field of $M$) near the surface of $M$. Let us assume that spacetime around
the mass is still the Minkowski space. Let A be the event of emission of a photon
of frequency $\nu$ from the emitter at the proper moment $\tau_A$ of the emitter and B the
event of reception of the photon at the receiver, at the proper time $\tau_B$ of the receiver.
Because the emitter and the receiver do not move wrt the sphere their worldlines
are parallel straight lines in Minkowski space. Due to the gravitational redshift the
frequency $\bar{\nu}$ of the photon which reaches the receiver is $\bar{\nu} \neq \nu$. Suppose we repeat
the experiment at the proper time $\tau_{A'}$ of the emitter and send a photon (event A′) which
is obtained by the receiver (event B′) at the proper time $\tau_{B'}$ of the receiver. Because
everything is static the frequency of the second received photon must also be $\bar{\nu}$. This
implies that the distance AB = A′B′ and due to the parallelism of the worldlines
we infer that ABB′A′ is a parallelogram in Minkowski space. But then AA′ = BB′,
which contradicts the gravitational time dilation!

The solution to this conflict is to assume that in the presence of gravity spacetime 
is not anymore the Minkowski space but a more general metric space, which in the 
small vicinity of any of its points (where the gravitational field can be considered 
homogeneous) can be approximated by Minkowski space. This new space has 
curvature and it is the substratum of General Relativity. 


34See reference in previous footnote. 


Chapter 8
Paradoxes


8.1 Introduction 


As has been remarked repeatedly in the previous chapters, the theory of Special
Relativity is not based on direct sensory experience, as is the case with Newtonian
Physics. This leads to situations which contradict “common sense”$^{1}$, in the sense
that the theory gives different results depending on the Newtonian observer (as we
are) who observes a given phenomenon. It is a fundamental hypothesis of Physics,
which makes Physics a unique science, that “reality” is observer independent.
However one must use the appropriate class of observers for each “reality”.

As expected, after the introduction of Special Relativity and indeed after
its early success, many people (a natural reaction to the new) posed several
“hypothetical” experiments with the purpose of demonstrating the conflict of the
“new” theory with the common Newtonian sense, hence its invalidation. Most of
these suggestions involve the length contraction and the time dilation, because both
(Euclidean) space and (Newtonian) time are fundamental to our perception of the
world. All these “experiments” are collectively referred to as “paradoxes”.

All paradoxes concern Newtonian physical situations (“common sense”), which
are transferred over to Special Relativity without checking if this is possible and
in what way. All paradoxes can be explained if one considers the problem in the
relativistic “common sense”, that is, using relativistic observers. We believe that
whatever paradoxes will be proposed they will be explained by Special Relativity.
Because if Special Relativity had even a small “bug” it is rather unlikely that this


$^{1}$ Einstein, in order to show the insufficiency of “common sense” in various projections of
the human mind, remarked that “Common sense is the aggregate of prejudices acquired
by the age of eighteen”. However, long before him Heraclitus had remarked that
“τοῦ λόγου δ' ἐόντος ξυνοῦ ζώουσιν οἱ πολλοὶ ὡς ἰδίαν ἔχοντες φρόνησιν” which means
“Although common sense is common, most people consider it as if it is their own”.


© Springer Nature Switzerland AG 2019 269 
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-27347-7_8 




has not yet been shown after the everyday routine application of the theory in the
laboratory and the real world (atomic bomb!) for more than 100 years.

However, paradoxes are useful because they help us understand the theory better
and it is possible that they reveal facets of the theory that we have not noticed yet.
A similar case is the contribution of Einstein to Quantum Theory. Indeed Einstein
was very sceptical about Quantum Theory (“God does not play dice”) and
proposed many paradoxes and arguments in order to disprove that theory. The
search for answers to these arguments greatly helped Quantum Mechanics, to the
degree that later these arguments were considered to be Einstein’s contribution to
the development of Quantum Mechanics.

In the following we discuss the well known twin paradox as well as more
advanced and relatively unknown paradoxes.


8.2. Various Paradoxes 


In this section we discuss a number of paradoxes which concern the length 
contraction, the time dilation and other Newtonian physical quantities. 


Example 8.2.1 In Fig. 8.1 the H-shaped slider CDEZ slides with constant velocity
$\beta c$ along the ends of the circuit $\Sigma$. We assume that the current can flow only along
the sides CE and DZ and that the distance of the points C, D equals the distance of
the points A, B when the slider CDEZ rests in the circuit $\Sigma$. Let $\Sigma'$ be the proper
observer of the slider. According to $\Sigma$, as the slider moves the distance of the
points C, D is shortened due to length contraction; therefore the lamp will be off for a
period of time $T_{off}$ (say). According to the proper observer $\Sigma'$ of the slider, for the
same reason the distance of the points A, B is shorter than that of the points C, D,
therefore the lamp will be always on. Prove that the two views, however different,
can be explained and hence do not lead to a paradox.


Fig. 8.1 The paradox of the lamp

Solution


Before we consider the solution of the paradox$^{2}$ we note that when we close a
switch the current develops in the conductor with speed $\beta_s c$, where $\beta_s c$ is the speed
of propagation of the electromagnetic field in the conductor. This implies that for an
interruption of the development of a current in a conductor there is a “dead” time.
We use this effect in order to explain the paradox.

The circuit is open when the following events take place: 


a. The point C leaves contact A 
b. The electric pulse which is created when the point B comes in contact with point 
D travels along the part AB of the circuit. 


Assuming that the lamp is at the end A of the conductor and that AB = CD = $l_0$
we have for each of the observers the following time intervals concerning the events
a. and b.:

Observer $\Sigma$

The time period that the circuit stays “off” due to dissociation of the points A, C
(Lorentz contraction, event a.) equals:

$$t_1 = \frac{l_0 - \frac{l_0}{\gamma}}{\beta c}.$$

The time required by the current to develop in the circuit after point D closes the
circuit at the point B (event b.) is:

$$t_2 = \frac{AB}{\beta_s c} = \frac{l_0}{\beta_s c}.$$

Therefore for $\Sigma$ the time period for which the lamp is off equals:

$$T_{off} = t_1 + t_2 = \frac{l_0 - \frac{l_0}{\gamma}}{\beta c} + \frac{l_0}{\beta_s c} = \frac{l_0}{\beta c}\left(1 - \frac{1}{\gamma}\right) + \frac{l_0}{\beta_s c}. \tag{8.1}$$

Observer $\Sigma'$

Assume that the observer $\Sigma'$ counts time zero when the end D coincides with the
point B. The moment that the end point C dissociates from the end point A is$^{3}$


$^{2}$ For a different approach see G. Sastry (1987) ‘Is length contraction really paradoxical?’ Am. J.
Phys. 55, pp 943-46.

$^{3}$ We note that $t'_2 > t'_1$, therefore for observer $\Sigma'$ too the circuit remains open for some time and
the lamp will be turned off. The proof that $t'_2 > t'_1$ is as follows. It is enough to show that:

$$\frac{\gamma l_0}{\beta_s c} + \frac{\gamma \beta l_0}{c} > \frac{l_0}{\beta c} - \frac{l_0}{\gamma \beta c} \iff \frac{\gamma}{\beta_s} + \frac{\gamma - 1}{\beta} > 0$$

which is true.


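(event a.):

$$t'_1 = \frac{\text{Difference of length } AB \text{ in } \Sigma' \text{ and in } \Sigma}{\beta c} = \frac{l_0 - \frac{l_0}{\gamma}}{\beta c}.$$

Assuming that the speed of the pulse for observer $\Sigma'$ is $\beta'_s c$, the distance AB,
which for $\Sigma'$ has length $\frac{l_0}{\gamma}$ and speed $-\beta c$, is covered in the interval (event b.):

$$t'_2 = \frac{l_0}{\gamma(\beta'_s - \beta)c}$$

where $(\beta'_s - \beta)c$ is the speed of propagation of the pulse relative to the “moving” circuit AB.
The time period for which the circuit stays off for $\Sigma'$ is:

$$T'_{off} = t'_2 - t'_1 = \frac{l_0}{\gamma(\beta'_s - \beta)c} - \frac{l_0 - (l_0/\gamma)}{\beta c}.$$

From the relativistic rule of composition of 3-velocities we have:

$$\beta'_s = \frac{\beta_s + \beta}{1 + \beta_s \beta} \Rightarrow \beta'_s - \beta = \frac{\beta_s}{\gamma^2 (1 + \beta_s \beta)}.$$

Replacing we calculate $T'_{off}$:

$$T'_{off} = \frac{\gamma l_0}{\beta_s c} + \frac{\gamma l_0 \beta}{c} - \frac{l_0}{\beta c} + \frac{l_0}{\gamma \beta c} = \gamma T_{off}.$$

This relation proves that if the lamp turns off for $\Sigma$ then it does so for $\Sigma'$ too, for time
periods related by the time dilation formula:

$$T'_{off} = \gamma T_{off}.$$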
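
The relation between the two “off” periods can be verified numerically. The sketch below simply evaluates the two expressions for the periods as computed by $\Sigma$ and by $\Sigma'$ and compares their ratio with $\gamma$; the function names and the sample values of $\beta$, $\beta_s$ and $l_0$ are illustrative assumptions, and units with $c = 1$ are used by default.

```python
import math

def gamma(beta):
    """Lorentz factor for speed beta (in units of c)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def t_off_circuit(l0, beta, beta_s, c=1.0):
    """Lamp-off period according to the circuit observer Sigma."""
    g = gamma(beta)
    t1 = (l0 - l0 / g) / (beta * c)   # C has left A, D has not yet reached B
    t2 = l0 / (beta_s * c)            # pulse travels the length AB
    return t1 + t2

def t_off_slider(l0, beta, beta_s, c=1.0):
    """Lamp-off period according to the slider observer Sigma'."""
    g = gamma(beta)
    beta_s_prime = (beta_s + beta) / (1.0 + beta_s * beta)  # velocity composition
    t1p = (l0 - l0 / g) / (beta * c)                 # moment C dissociates from A
    t2p = (l0 / g) / ((beta_s_prime - beta) * c)     # moment the pulse reaches A
    return t2p - t1p

if __name__ == "__main__":
    l0, beta, beta_s = 1.0, 0.6, 0.9
    ratio = t_off_slider(l0, beta, beta_s) / t_off_circuit(l0, beta, beta_s)
    print("ratio :", ratio)        # agrees with gamma(beta)
    print("gamma :", gamma(beta))
```

Trying other values with $0 < \beta < 1$ and $0 < \beta_s \le 1$ should give the same agreement, since the identity is algebraic.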
Example 8.2.2 A source of monochromatic light is moving in the LCF $\Sigma$ with
constant velocity $u$ in the plane $x$-$z$, parallel to the $x$-axis, at a height $h_s$ along
the $z$-axis. A wall of negligible thickness and of height $h_w$ ($h_w < h_s$) is placed
parallel to the plane $y$-$z$ and in front of the light source. Due to the presence of the
wall a shadow is created in the rear part of the wall along the $x$-axis (see Fig. 8.2).
From Newtonian Physics we expect that: 


(a) The shadow will be continuous 

(b) As the source approaches the wall the length of the shadow on the x—axis will 
diminish and will vanish when the source is exactly over the wall (that is at the 
z—axis according to Fig. 8.2). 

(c) First the most distant points from the wall will be lighted. 


Fig. 8.2 The shadow paradox

Show that in Special Relativity the behavior of the shadow is different from the
Newtonian one if the velocity $u$ is such that $\frac{u}{c} > 1 - \frac{h_w}{h_s}$. Specifically, show that
the shadow vanishes before the light source reaches the wall. Calculate the distance
of the light source from the wall when the shadow vanishes.$^{4}$

Solution 

Firstly we discuss the phenomenon qualitatively. We consider a point S along the
orbit of the light source and let P be the point along the $x$-axis which is lighted
by the light ray emitted from the source at the point S and grazing the wall at the
point T (see Fig. 8.2). The shadow at the point P disappears when this ray reaches
the $x$-axis. Since light propagates with constant finite velocity, when this light ray
reaches the point P, the light source will emit a new light ray from another point
S′. This light ray will reach another point P′ at which the shadow will disappear
(see Fig. 8.2). Depending on the velocity of the source it is possible that the time
required for the light ray to cover the distance SP is larger than the sum of the
time intervals required by the source to cover the distance SS′ and the light ray
emitted from the point S′ to cover the distance S′P′. If this is the case the shadow
disappears firstly at the point P′ and later at the point P! This behavior is different
from the expected Newtonian behavior and it is due to the finite speed of light.

Let us come now to the quantitative discussion. Let Σ be the rest frame of the
wall and Σ′ the rest frame of the light source. Let $t_S$ be the moment of Σ at which the
light source is at the point S and $t_P$ the moment at which the light ray reaches the
point P, which we assume to be situated at a distance $x_P$ from the wall. From the
geometry of Fig. 8.2 we have:

$$x_P = h_w\tan\phi \tag{8.2}$$

$$(SP) = \frac{h_s}{\cos\phi} = \frac{1}{k}\sqrt{h_w^2 + x_P^2} \tag{8.3}$$


*For a different approach see Hon-Ming Lai (1975), 'Extraordinary shadow appearance due to
fast moving light', Am. J. Phys. 43, pp. 818-820.




where we have set $k = h_w/h_s < 1$. We note that when the light source is infinitely far
from the wall the angle $\phi = \pi/2$ and $x_P \to \infty$, hence the shadow covers the whole part
of the x-axis on the other side of the wall.

The moment of Σ at which the light ray reaches the x-axis, destroying the
shadow at the point P, is:

$$t_P = t_S + \frac{(SP)}{c} = t_S + \frac{h_s}{c\cos\phi}. \tag{8.4}$$


The speed at which the shadow disappears in Σ is given by the speed $v_P$ of the point P
in Σ. In time $dt_S$ (of Σ) the source moves the distance $u\,dt_S$ and the point P the
distance $dx_P = v_P\,dt_S$. From (8.2) we obtain:

$$dx_P = \frac{h_w}{\cos^2\phi}\,d\phi.$$

But from Fig. 8.2 we have:

$$(SR) = (h_s - h_w)\tan\phi \;\Longrightarrow\; d\phi = -\frac{u\cos^2\phi}{h_s - h_w}\,dt_S$$

where we have replaced $d(SR) = -u\,dt_S$. The last two relations imply (with $k = h_w/h_s$):

$$v_P = \frac{dx_P}{dt_S} = -\frac{h_w u}{h_s - h_w} = -\frac{k u}{1 - k} \qquad (k < 1). \tag{8.5}$$


We note that the velocity $v_P$ always points towards the wall and has a constant
magnitude which depends (as expected) on the speed $u$ of the source. Furthermore, we note
that the speed $|v_P|$ can be as great as we wish, depending on the relation
of the heights $h_s$, $h_w$. This result does not conflict with the upper limit set by the speed
of light in vacuum, because the point P is not a material point but an image (it carries no
energy).
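
The claim that the image point can outrun light is easy to check numerically. The following is a minimal sketch, not part of the original solution: it assumes units with c = 1, and the function names and sample heights are illustrative choices. It compares a finite-difference estimate of $dx_P/dt_S$, taken directly from the geometry of Fig. 8.2, with the closed form (8.5).

```python
def image_position(s, h_s, h_w):
    """Distance x_P of the image point P from the wall when the source is a
    horizontal distance s = (SR) before the wall (tan(phi) = s / (h_s - h_w))."""
    return h_w * s / (h_s - h_w)


def image_velocity(u, h_s, h_w):
    """Closed form (8.5): v_P = -h_w u / (h_s - h_w), in units of c."""
    return -h_w * u / (h_s - h_w)


# Assumed sample values: source only slightly higher than the wall, u = 0.5 c.
h_s, h_w, u = 1.1, 1.0, 0.5
dt = 1e-6                     # small step of emission time t_S
s0 = 3.0                      # initial horizontal distance of the source
s1 = s0 - u * dt              # the source approaches the wall
v_numeric = (image_position(s1, h_s, h_w) - image_position(s0, h_s, h_w)) / dt

print("finite difference:", v_numeric)
print("closed form (8.5):", image_velocity(u, h_s, h_w))
```

With these sample values the image moves towards the wall at about five times the speed of light, although the source itself moves at only 0.5c.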

We consider now a new position S′ of the light source after the point S and let
P′ be the point where the light ray emitted by the source at the point S′ and grazing
the wall at the point T hits the x-axis. The moment $t_{P'}$ at which the shadow
at the point P′ is destroyed is:

$$t_{P'} = t_S + \frac{(SS')}{u} + \frac{(S'P')}{c}
= t_S + \frac{(SR) - (S'R)}{u} + \frac{h_s}{c\cos\phi'}
= t_S + \frac{h_s - h_w}{u}(\tan\phi - \tan\phi') + \frac{h_s}{c\cos\phi'}.$$




We look now for a position S′ such that the light rays from the points S, S′
reach the x-axis at the same moment, destroying the shadow at the respective points
P, P′. To find this position we equate the times $t_P$, $t_{P'}$ and find (with $k = h_w/h_s$):

$$t_S + \frac{h_s}{c\cos\phi} = t_S + \frac{h_s - h_w}{u}(\tan\phi - \tan\phi') + \frac{h_s}{c\cos\phi'}
\;\Longrightarrow\;
\sin\frac{\phi + \phi'}{2} = \frac{1 - k}{\beta}\cos\frac{\phi - \phi'}{2}. \tag{8.6}$$


Obviously there are values of φ, φ′ for which this condition is satisfied, therefore
the point P′, which is closer to the wall, is lit before the point P! In these cases
the shadow behind the wall is not destroyed by the motion of the point P
but by the lighting of the point P′.

The condition for the complete disappearance of the shadow is φ′ = 0. When
this is the case the angle $\phi_0$ is given by the relation (with $k = h_w/h_s$):

$$\tan\frac{\phi_0}{2} = \frac{1 - k}{\beta}.$$

In order for this value to be acceptable it must satisfy the condition $\phi_0 < \pi/2$, that is
$\tan\frac{\phi_0}{2} < 1$. This condition gives for β:

$$\beta > 1 - k$$

which is possible because the rhs is < 1. We conclude that for speeds of the source
β > 1 − k the shadow disappears although the source has not reached the wall yet.
This behavior contradicts the Newtonian one. For smaller speeds the Newtonian and
the relativistic behaviors do not differ qualitatively.

We have still to calculate the position S of the source for which the shadow
disappears completely. From the previous calculations we have (with $k = h_w/h_s$):

$$(SR) = (h_s - h_w)\tan\phi_0 = (h_s - h_w)\,\frac{2\tan\frac{\phi_0}{2}}{1 - \tan^2\frac{\phi_0}{2}}
= \frac{2h_s\beta(1 - k)^2}{\beta^2 - (1 - k)^2}.$$

We note that as β → 0 the (SR) → 0, that is, we have the expected Newtonian
result.

The critical distance $x_{P,0}$ at which the shadow disappears along the x-axis is
given by:

$$x_{P,0} = h_w\tan\phi_0 = \frac{2h_w\beta(1 - k)}{\beta^2 - (1 - k)^2}.$$
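
The critical configuration can be checked numerically. The sketch below is an illustration under stated assumptions (units with c = 1, $k = h_w/h_s$, hypothetical function names and sample values): it computes $\phi_0$, (SR) and $x_{P,0}$ and verifies that the ray emitted at S and the vertical ray emitted when the source is exactly above the wall reach the x-axis at the same moment.

```python
import math


def shadow_vanishing(h_s, h_w, beta):
    """Return (phi0, SR, x_P0) for a source with speed factor beta, or None
    when beta <= 1 - k, in which case the shadow behaves as in Newtonian Physics."""
    k = h_w / h_s
    if beta <= 1.0 - k:
        return None
    phi0 = 2.0 * math.atan((1.0 - k) / beta)   # tan(phi0 / 2) = (1 - k) / beta
    SR = (h_s - h_w) * math.tan(phi0)          # distance of the source from the wall
    x_P0 = h_w * math.tan(phi0)                # critical distance on the x-axis
    return phi0, SR, x_P0


def arrival_time_difference(h_s, h_w, beta):
    """t_P - t_P' (c = 1) for the ray emitted at S with angle phi0 and for the
    vertical ray (phi' = 0) emitted when the source is above the wall."""
    phi0, SR, _ = shadow_vanishing(h_s, h_w, beta)
    t_P = h_s / math.cos(phi0)        # light travel time along SP
    t_P_prime = SR / beta + h_s       # source travel time SR/u plus vertical ray
    return t_P - t_P_prime


# Assumed sample values: h_s = 2, h_w = 1 (k = 0.5) and beta = 0.8 > 1 - k.
print(shadow_vanishing(2.0, 1.0, 0.8))
print(arrival_time_difference(2.0, 1.0, 0.8))   # zero up to rounding
```

For β below 1 − k the function returns `None`, which corresponds to the case where the Newtonian and relativistic behaviors do not differ qualitatively.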


Before that point we have two light rays destroying the shadow. Which of these will
destroy it first and how will this be done? To answer this question we calculate the
velocity $v_{P'} = \frac{dx_{P'}}{dt_S}$ with which the image point P′ moves along the x-axis. We
have (we use (8.5)):

$$v_{P'} = \frac{dx_{P'}}{dt_S} = \frac{d}{dt_S}(h_w\tan\phi') = \frac{h_w}{\cos^2\phi'}\frac{d\phi'}{dt_S}
= \frac{h_w}{\cos^2\phi'}\frac{d\phi'}{d\phi}\frac{d\phi}{dt_S}
= v_P\,\frac{\cos^2\phi}{\cos^2\phi'}\frac{d\phi'}{d\phi}.$$

Because $\pi/2 > \phi > \phi'$ the ratio $\frac{\cos^2\phi}{\cos^2\phi'} < 1$. Also from (8.6) one can show that
$-1 < \frac{d\phi'}{d\phi} < 0$.

We note that the velocity $v_{P'}$ has direction opposite to that of $v_P$ and smaller
magnitude. This is expected as the point P′ is closer to the wall, hence it must move
opposite to the point P in order to destroy the shadow in the interval $x_P - x_{P'}$.

Finally, we note that the length of the shadow along the x-axis equals:

$$x_P - x_{P'} = h_w(\tan\phi - \tan\phi') = h_w\,\frac{\sin(\phi - \phi')}{\cos\phi\cos\phi'} > 0.$$

This is always positive, therefore the shadow is constantly diminishing.


Example 8.2.3 In the LCF Σ a right angle lever BAC with equal lengths AB =
AC = L is at rest under the influence of a pair of forces f as shown in Fig. 8.3. At
some moment, the lever starts sliding along the x-axis with constant speed u while
remaining under the action of the same couple of forces in its rest frame. Obviously,
the proper observer, Σ′ say, of the lever will observe no change (because nothing
changes concerning the couple of forces). However, the observer Σ will "see" the
lever rotate, because due to length contraction of the side AB a net moment will
apply to the lever. Explain the above paradox.


Fig. 8.3. The paradox of the L-shaped object


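It is given that the transformation of the four-force under a boost along the
x-axis is:

$$f'_x = \frac{1}{1 - \frac{u v_x}{c^2}}\left(f_x - \frac{u}{c^2}\,\mathbf{f}\cdot\mathbf{v}\right)$$

$$f'_y = \frac{f_y}{\gamma(u)\left(1 - \frac{u v_x}{c^2}\right)}$$

$$f'_z = \frac{f_z}{\gamma(u)\left(1 - \frac{u v_x}{c^2}\right)}.$$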


Solution 
In the proper frame Σ′ of the lever the velocity vanishes; therefore, the inverse
boost from Σ′ to Σ of the components of the four-force gives:

$$f_x = f'_x, \qquad f_y = f'_y/\gamma(u), \qquad f_z = f'_z/\gamma(u).$$

The forces which act on the lever in the LCF Σ′ are $\mathbf{f}'_B = f\,\mathbf{j}$ and $\mathbf{f}'_C = f\,\mathbf{i}$.
These forces transform in Σ to:

$$\mathbf{f}_B = (0, f/\gamma(u), 0) = \frac{f}{\gamma(u)}\,\mathbf{j}, \qquad \mathbf{f}_C = (f, 0, 0) = f\,\mathbf{i}.$$

We take moments in Σ wrt the point A of the lever:

$$f_B(AB)_\Sigma - f_C(AC)_\Sigma = \frac{f}{\gamma(u)}\frac{L}{\gamma(u)} - fL = -\beta^2 fL < 0.$$
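
The arithmetic of the last step can be reproduced with a few lines of code. This is only an illustrative sketch (the function name and the sample values are assumptions): it combines the transformed transverse force f/γ with the contracted arm L/γ and compares the result with $-\beta^2 fL$.

```python
import math


def net_moment_in_sigma(f, L, beta):
    """Net moment about A as computed in Sigma for the right-angle lever."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    f_B = f / gamma      # force at B is transverse to the boost: reduced by 1/gamma
    AB = L / gamma       # arm AB lies along the boost: length contracted
    f_C = f              # force at C is along the boost: unchanged
    AC = L               # arm AC is transverse to the boost: unchanged
    return f_B * AB - f_C * AC


f, L, beta = 2.0, 3.0, 0.6          # assumed sample values
print(net_moment_in_sigma(f, L, beta), -beta**2 * f * L)
```

The two printed numbers agree, and the moment vanishes only for β = 0.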


We note that the total moment about the point A in Σ does not vanish, hence the
lever will rotate about A for the one observer (Σ) and will not for the
proper observer Σ′. The source of this paradoxical behavior is the assumption
that the lever is a solid body. In Relativity (both Special and General) there are
no solid bodies, because their existence relies on the Euclidean metric of the
3-dimensional space.⁵


⁵More on this paradox, which has a long history in Special Relativity, the reader can find in the
following articles:

1. J.C. Nickerson and R. McAdory (1975), 'Right-angle lever paradox', Am. J. Phys. 43, 615.
2. D. Jensen (1989), 'The paradox of the L-shaped object', Am. J. Phys. 57, 553.


Example 8.2.4 An equilateral triangle of side a slides with speed u along the
x-axis of the LCF Σ. Compute the perimeter of the triangle when it slides:


a. Along one of its heights

b. Along one of its sides.

Comment on the results in the limit β → 0 (Newtonian limit) and β → 1
(relativistic limit).


Solution 


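a. We consider the events A, B, C to be the positions of the tips of the triangle at
each moment. We know the coordinates of these events in the proper frame Σ′,
say, of the triangle and we compute them in the frame Σ using the appropriate
boost. With the help of Fig. 8.4 we obtain the following table of coordinates:

$$\begin{array}{lll}
 & \Sigma & \Sigma' \\
A: & (ct_A, x_A, 0, 0)_\Sigma & (ct'_A, 0, 0, 0)_{\Sigma'} \\
B: & (ct_B, x_B, y_B, 0)_\Sigma & \left(ct'_B, \frac{a\sqrt3}{2}, \frac{a}{2}, 0\right)_{\Sigma'} \\
C: & (ct_C, x_C, y_C, 0)_\Sigma & \left(ct'_C, \frac{a\sqrt3}{2}, -\frac{a}{2}, 0\right)_{\Sigma'} \\
BA^i: & (c\Delta t_{BA}, x_B - x_A, y_B, 0)_\Sigma & \left(0, \frac{a\sqrt3}{2}, \frac{a}{2}, 0\right)_{\Sigma'} \\
CA^i: & (c\Delta t_{CA}, x_C - x_A, y_C, 0)_\Sigma & \left(0, \frac{a\sqrt3}{2}, -\frac{a}{2}, 0\right)_{\Sigma'} \\
CB^i: & (c\Delta t_{CB}, x_C - x_B, y_C - y_B, 0)_\Sigma & (0, 0, -a, 0)_{\Sigma'}
\end{array}$$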


Fig. 8.4 Motion of an equilateral triangle


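The differences $x_B - x_A = x_C - x_A = (AD)_\Sigma = a'\cos\phi$, where $a'$ is the side $(AB)_\Sigma =
(AC)_\Sigma$ in Σ and φ is the angle BAD in Σ. From the triangle ADB we obtain
$\sin\phi = \frac{(BD)_\Sigma}{a'} = \frac{a}{2a'}$, hence:

$$x_B - x_A = a'\sqrt{1 - \frac{a^2}{4a'^2}} = \sqrt{a'^2 - \frac{a^2}{4}}.$$

The boost gives $x_B - x_A = \frac{a\sqrt3}{2\gamma}$. Replacing $x_B - x_A$ we find:

$$a' = \frac{\sqrt{4 - 3\beta^2}}{2}\,a.$$

Obviously in the LCF Σ the triangle ABC is isosceles with perimeter:

$$T = 2a' + a = a\left[1 + \sqrt{4 - 3\beta^2}\right].$$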


Second Solution 

Since the side (BC) is normal to the relative velocity it will stay normal and
with the same length, that is $(BC)_\Sigma = a$. The points B, C are symmetric about
the x-axis, which coincides with the direction of the relative velocity, hence they
will stay symmetric in Σ. This implies that the point D is the middle point of
(BC) in Σ′ and in Σ. The side (AD), due to length contraction, has in Σ the length
$(AD)_\Sigma = \frac{a\sqrt3}{2\gamma}$. Therefore:

$$(AB)_\Sigma = \sqrt{(BD)^2_\Sigma + (AD)^2_\Sigma} = \sqrt{\frac{a^2}{4} + \frac{3a^2}{4\gamma^2}} = \frac{a}{2}\sqrt{4 - 3\beta^2}$$

from which the perimeter follows easily.

b. This case is described in Fig. 8.5. We give only the "practical" solution and leave
the coordinate solution for the reader. We consider the height BD and for obvious
reasons we write:

$$(BD)_\Sigma = (BD)_{\Sigma'} = \frac{a\sqrt3}{2}, \qquad (AD)_\Sigma = \frac{1}{\gamma}(AD)_{\Sigma'} = \frac{a}{2\gamma}.$$

Fig. 8.5 Motion of an isosceles triangle

Therefore:

$$(BC)_\Sigma = (AB)_\Sigma = \sqrt{(AD)^2_\Sigma + (BD)^2_\Sigma} = \frac{a}{2}\sqrt{4 - \beta^2}.$$

The perimeter of the triangle ABC for Σ is:

$$T = 2(AB)_\Sigma + (AC)_\Sigma = a\left[\sqrt{1 - \beta^2} + \sqrt{4 - \beta^2}\right].$$


Concerning the Newtonian and the relativistic limits of the perimeter we see that
in both cases the Newtonian limit (β → 0) of the perimeter of the triangle equals 3a,
that is, the triangle behaves as a solid body and its angles and the lengths of its sides
do not change. In the relativistic limit (β → 1) in the first case the perimeter of
the triangle becomes 2a, the angle BAC → 180° and the triangle degenerates to a
straight line. In the second case the perimeter equals $a\sqrt3$ and the angle ABC → 0,
that is, the triangle degenerates again to a straight line, with perimeter twice the height BD.

We see that the perimeter of a triangle in Σ depends on the way the triangle
moves in Σ. This shows clearly that there are no rigid bodies (in the Newtonian
sense!) in Relativity.
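
Both perimeter formulas can be cross-checked by contracting the rest-frame vertices directly. The sketch below is an illustration only: the function names and the placement of the vertices (A at the origin, motion along x) are assumptions made for the example.

```python
import math


def perimeter_along_height(a, beta):
    """Closed form for case a: a [1 + sqrt(4 - 3 beta^2)]."""
    return a * (1.0 + math.sqrt(4.0 - 3.0 * beta**2))


def perimeter_along_side(a, beta):
    """Closed form for case b: a [sqrt(1 - beta^2) + sqrt(4 - beta^2)]."""
    return a * (math.sqrt(1.0 - beta**2) + math.sqrt(4.0 - beta**2))


def perimeter_from_vertices(vertices, beta):
    """Contract the x-coordinates of the rest-frame vertices by 1/gamma
    and add up the three side lengths measured in Sigma."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    pts = [(x / gamma, y) for x, y in vertices]
    return sum(math.dist(pts[i], pts[(i + 1) % 3]) for i in range(3))


a, beta = 1.0, 0.6
h = a * math.sqrt(3.0) / 2.0                         # height of the triangle
case_a = [(0.0, 0.0), (h, a / 2.0), (h, -a / 2.0)]   # height AD along x
case_b = [(0.0, 0.0), (a / 2.0, h), (a, 0.0)]        # side AC along x

print(perimeter_along_height(a, beta), perimeter_from_vertices(case_a, beta))
print(perimeter_along_side(a, beta), perimeter_from_vertices(case_b, beta))
```

Evaluating the closed forms at β = 0 gives 3a in both cases, and values of β close to 1 approach 2a and $a\sqrt3$ respectively.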


Example 8.2.5. A classic paradox is the twin paradox, originated in 1911 by Paul
Langevin, for which many papers and arguments have been put forward.⁶ Consider
two twins at rest in the LCF Σ. At some moment one of the twins leaves the
other twin, moves with constant velocity for a period of time in Σ, then changes
instantly his direction of motion and, moving again with constant velocity, returns to
the point in Σ where the other twin is waiting. When the twins meet they find that
they have aged differently and, in particular, the traveling twin is younger than the other.
Explain this paradox.


⁶For literature relevant to this paradox see for example D. Greenberger (1972), 'The Reality of the
Twin Paradox Effect', Am. J. Phys. 40, pp. 750-754.


Fig. 8.6 The twin paradox (world line of the moving twin and world line of the standing twin)


Solution 

Let 


- A be the event that the moving twin departs, leaving the other twin in Σ,
- B be the event that the moving twin changes his direction of motion and
- C be the event that the twins meet again.


The three events are shown in Fig. 8.6 together with the world lines of the twins.

Algebraic solution 

The age of each twin is measured by the length of his world line between
the events A and C. Therefore the age of the standing twin is AC and that of
the moving twin AB + BC. From the triangle inequality in Minkowski space (see
Proposition 11.3 Sect. 1.11) we have AC > AB + BC, that is, the traveling twin is
younger than the twin who stayed in Σ.
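
The inequality AC > AB + BC can be illustrated with concrete events. The following is a minimal sketch with assumed coordinates (units with c = 1, one space dimension, traveling twin moving at 0.8c); the helper name is hypothetical.

```python
import math


def proper_time(event_a, event_b, c=1.0):
    """Proper time along the straight timelike segment joining two events (t, x)."""
    dt = event_b[0] - event_a[0]
    dx = event_b[1] - event_a[1]
    return math.sqrt(dt**2 - (dx / c) ** 2)


# Assumed events in Sigma: departure A, turning point B, reunion C.
A = (0.0, 0.0)
B = (5.0, 4.0)     # outbound leg at 0.8 c
C = (10.0, 0.0)    # return leg at 0.8 c

tau_standing = proper_time(A, C)                        # length of AC
tau_traveling = proper_time(A, B) + proper_time(B, C)   # AB + BC
print(tau_standing, tau_traveling)   # prints 10.0 6.0
```

The straight world line AC is the longer one, which is the reverse of the Euclidean triangle inequality.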

Geometric solution 

The event B is simultaneous with the events B′, C′ of the twin in Σ. Therefore
the duration AB of the moving twin equals AB′ of the standing twin and a similar
result holds for the lengths BC and C′C. Therefore the twins do not age the same
and the additional aging of the standing twin in Σ is given by the length
B′C′.


Chapter 9
Mass: Four-Momentum


9.1 Introduction 


In the previous sections we considered the kinematics of Special Relativity, which
concerns the study of the four-vectors of position, velocity and acceleration.
The major result of this study was the geometric description/definition of the
Relativistic Mass Particle (ReMaP) as a set of four-vectors, which at each point
along the worldline of the particle have a common proper frame or characteristic
frame, depending on whether they are timelike or spacelike respectively.

With the three four-vectors of four-position, four-velocity and four-acceleration
one is able to study the geometric properties of the worldline of the ReMaP but not
the individual characteristics of the ReMaP and its interaction with the environment.
For example it is not possible to say if a given worldline concerns an electron
moving in an electromagnetic field or a ReMaP which is mechanically accelerated.
Furthermore it is not possible to predict the worldline of a ReMaP moving under the
action of a given dynamical field. In conclusion, Kinematics studies the worldline
only as a geometric entity, independently of the cause and the internal structure of
the system.

The situation is the same with Newtonian Physics in which these three cor- 
responding quantities of Newtonian Kinematics do not suffice for the study of 
Newtonian motion. As we do in Newtonian Physics we introduce in Special 
Relativity new relativistic physical quantities, the Dynamical Relativistic Physical 
Quantities, which characterize the worldlines with more information. The set of all 
these quantities together with the “laws” which govern them comprises the field of 
Relativistic Dynamics. 


© Springer Nature Switzerland AG 2019
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-27347-7_9 




We recall that the relativistic physical quantities are Lorentz tensors which in the 
proper frame of the ReMaP must satisfy the following two conditions: 


• They must attain their reduced or canonical form.

• Either their components must be directly related to corresponding Newtonian
physical quantities or their physical meaning must be defined by a relativistic
principle.


These two conditions must be satisfied by all dynamic physical quantities we 
shall introduce in the following. 

The simplest tensors are the invariants. Furthermore from the invariants we are 
able to construct new tensors using the rules of Proposition 1.4.1, that is either by 
multiplying or differentiating wrt an invariant. 

The first invariant we introduce is the (relativistic) mass and, using this, the four- 
vectors of four-momentum, four-force etc. 

We emphasize that the photons, and in general the particles with speed c are not 
ReMaP, therefore they do not have a proper frame. This implies that the definition of 
the dynamical quantities for these particles, hence their dynamics, is different than 
the dynamics of ReMaP and must be treated accordingly. 


9.2 The (Relativistic) Mass 


Before we define the (Lorentz) invariant dynamical physical quantity (relativistic)
mass we make some general comments which concern all invariant dynamical
physical quantities.


• Any (Lorentz) invariant is a potential relativistic physical quantity and it is
characterized by the fact that it has the same (arithmetic) value in all LCF.
Therefore it suffices to know the value of an invariant in one LCF.

• In order for a potentially invariant physical quantity to become a relativistic physical
quantity its value in the proper frame of the ReMaP must coincide with a
corresponding Newtonian physical quantity. If there is no such quantity, then
its relativistic physical role will be defined by means of a principle (for example
the invariant c).

• Every invariant relativistic physical quantity of a ReMaP, say $A$, defines in
the proper frame Σ⁺ of the ReMaP a new potential relativistic physical quantity
by means of the timelike four-vector $(A, \mathbf{0})_{\Sigma^+}$. One such four-vector we have
already defined in kinematics by the invariant c; it is the four-velocity, whose
components in Σ⁺ are $(c, \mathbf{0})_{\Sigma^+}$.

• In addition to the four-vector $(A, \mathbf{0})_{\Sigma^+}$ the invariant defines more dynamical
relativistic physical quantities by means of the rules 1, 2 of Proposition 1.4.1.


The first invariant quantity we introduce is the (relativistic) mass m of the ReMaP.
In order to make the mass a relativistic physical quantity we stipulate that in
the proper frame of the ReMaP its value will be identical with the value of the
Newtonian mass of the ReMaP in that frame.¹

The mass of a ReMaP need not be a constant. For example in the case of a
relativistic rocket the mass of the rocket changes along the worldline, i.e. m = m(τ),
where τ is the proper time of the rocket. In this case every value m(τ) is a (Lorentz)
invariant and the mass of the rocket is a continuous sequence of relativistic invariants
parameterized by the proper time of the rocket. In this sense we consider the rate
of change dm/dτ of the proper mass of the rocket and whatever consequences this
has.

As we have already remarked the luxons (photons) do not have a proper frame,
therefore for those we cannot define the relativistic physical quantity mass as we did
for ReMaP. However we can define a "mass" which will be common for all luxons
as a limiting case of the (relativistic) mass of the ReMaP. Indeed we note that the
mass of a ReMaP cannot equal zero because no Newtonian particle has
mass zero. However the mass can approach zero as closely as one wishes, hence
zero is the lower limit of the mass of the ReMaP. Furthermore the
Lorentz transformation, being homogeneous, preserves zero. Therefore we define the
relativistic mass of luxons to be zero. As we shall see this choice is compatible with
the dynamic physical quantities of photons, to be considered further on.


9.3 The Four-Momentum of a ReMaP


Consider a ReMaP P of mass m and four-velocity $u^i$. By means of Rule 1
of Proposition 1.4.1 we define the potential relativistic physical quantity linear
momentum by the formula:

$$p^i = m u^i. \tag{9.1}$$

$p^i$ is a timelike four-vector with length:

$$p^i p_i = m^2 u^i u_i = -m^2c^2 < 0. \tag{9.2}$$


In the proper frame Σ⁺ of P the reduced form of the four-velocity is $(c, \mathbf{0})_{\Sigma^+}$,
hence the four-momentum is $p^i = (mc, \mathbf{0})_{\Sigma^+}$. The zeroth component mc of $p^i$ in
the proper frame has physical meaning (by definition, not by Newtonian analogue!)
because both m and c are relativistic physical quantities. Therefore the four-vector
$p^i$ is a relativistic physical quantity. In Special Relativity the four-vector of (linear)
momentum is as important as the linear momentum vector in Newtonian Physics.
As will be shown in the following, its components account for both the energy
and the 3-momentum of the ReMaP in a LCF.


¹Usually in the literature the relativistic mass of a ReMaP is referred to as rest mass and is written
as $m_0$. This is due to the fact that its value is defined in the proper frame of the ReMaP, where the
ReMaP is at rest. However it is important to make clear that the relativistic mass is a (Lorentz)
invariant, hence its value is the same in all LCF and does not change "with the velocity" as is
erroneously claimed. Therefore the term "rest mass" is misleading and should be abandoned.


To find the components of $p^i$ we recall that in Σ the four-velocity $u^i$ has
components $u^i = \begin{pmatrix}\gamma c\\ \gamma\mathbf{u}\end{pmatrix}_\Sigma$, hence the four-momentum:

$$p^i = \begin{pmatrix} m\gamma c\\ m\gamma\mathbf{u}\end{pmatrix}_\Sigma. \tag{9.3}$$

In order to reveal the physical meaning of the components of the four-momentum in
the LCF Σ, we consider the Taylor expansion of γ around β = 0 (where γ = 1) and compare
the result with known Newtonian quantities. For the spatial components we have:

$$m\gamma\mathbf{u} = m\mathbf{u} + O(u^2/c^2)\,\mathbf{u}. \tag{9.4}$$


$m\mathbf{u}$ is the "Newtonian" linear momentum of the ReMaP P in Σ. Therefore it is
reasonable to assume that the quantity:

$$\mathbf{p} = m\gamma\mathbf{u} \tag{9.5}$$

is the (linear) 3-momentum of P in Σ.
Working in a similar fashion we find for the zeroth component of the four-momentum:

$$m\gamma c^2 = mc^2 + \frac{1}{2}mu^2 + O(u^4/c^2). \tag{9.6}$$

The quantity $\frac12 mu^2$ is the Newtonian kinetic energy of P in Σ, therefore the terms
$m\gamma c^2$ and $mc^2$ must be related to energy quantities. The term $mc^2$ involves only the
ReMaP P and it is independent of its motion. We identify this term with the internal
energy of the ReMaP P. Obviously the next term $\frac12 mu^2$ must be identified, to lowest
order in u/c, with the kinetic energy T of the ReMaP P in Σ. Finally we identify the term
$m\gamma c^2$ with the total energy E of P in Σ:

$$E = m\gamma c^2. \tag{9.7}$$


With these identifications the four-momentum of P in Σ has the following
representation:

$$p^i = \begin{pmatrix} E/c\\ \mathbf{p}\end{pmatrix}_\Sigma. \tag{9.8}$$

Using (9.7) and (9.8) and replacing in (9.2) we obtain the following fundamental
formula which relates E, p, m:

$$E = \sqrt{p^2c^2 + m^2c^4}. \tag{9.9}$$

This relation can be displayed on the Euclidean plane by means of a
right triangle (see Fig. 9.1). T is the kinetic energy of P in Σ.
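
Relations (9.5), (9.7) and (9.9) are easy to verify numerically. The sketch below is only an illustration: it assumes units with c = 1 by default, the function name is hypothetical, and the kinetic energy is taken as T = E − mc², which is then compared with the Newtonian value ½mu² at low speed.

```python
import math


def energy_momentum(m, u, c=1.0):
    """Return (E, p) for a particle of mass m and speed u, using
    E = m gamma c^2 (9.7) and p = m gamma u (9.5)."""
    gamma = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return m * gamma * c**2, m * gamma * u


m = 1.0

# Check the fundamental relation (9.9) at a relativistic speed.
E, p = energy_momentum(m, 0.6)
print(E, math.sqrt(p**2 + m**2))          # both sides of (9.9) with c = 1

# Compare T = E - m c^2 with the Newtonian kinetic energy at a low speed.
u_slow = 0.01
E_slow, _ = energy_momentum(m, u_slow)
T = E_slow - m
print(T, 0.5 * m * u_slow**2)             # differ at relative order u^2/c^2
```

At u = 0.01c the two kinetic energies differ by less than one part in ten thousand, in line with the error term in (9.6).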


Exercise 9.3.1 Prove that the angle φ in Fig. 9.1 is given by the relation:

$$\sin\phi = \frac{pc}{E} = \beta.$$

Determine the relation of φ with the rapidity of the ReMaP. Using this result
represent geometrically the boost in terms of the four-momentum. (Hint: Use the
invariants of the Lorentz transformation.)


The triangle of Fig. 9.1 allows one to use φ to distinguish between the
non-relativistic (β → 0) and the ultra relativistic (β → 1) motions (see Fig. 9.2).

Indeed we note that for low velocities (in Σ!) the major part of the (total) energy
E is due to the mass term $mc^2$, while for high velocities the energy is mainly due to the
momentum term pc, that is, the kinetic energy. This observation is important in practice,
because it indicates which terms can be ignored without affecting significantly the
final result.

The energy is the zeroth component of the four-momentum, therefore its value
depends on the LCF Σ where the motion is considered. The following mistake is
often made. Because the quantity $E/c^2 = m\gamma$ has dimensions of mass, some people set
$M = m\gamma$ and assume that M is the "mass" of the ReMaP. Then they conclude that
"the mass depends on the velocity". This is absurd because M is not a relativistic
physical quantity (it is not a Lorentz invariant), hence it has no physical meaning in
Special Relativity. It is simply another name for the energy E of the ReMaP and
varies from frame to frame according to the Lorentz transformation of the zeroth
component of the four-momentum.


Fig. 9.1 The relativistic energy triangle

Fig. 9.2 Relativistic and Newtonian limit of the energy – 3-momentum relation


Example 9.3.1


(a) Show that if the momentum of a ReMaP is larger than the mass by two orders
of magnitude, then the energy equals the measure of the 3-momentum to the
order of $10^{-4}$. Conclude that in these cases practically E = pc. This is the ultra
relativistic case.

(b) Repeat the calculation assuming that the mass of the ReMaP is two orders
of magnitude larger than the (length of the) 3-momentum and show that this case
corresponds to the non-relativistic case, therefore we may take $E = mc^2$.


Solution 


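(a) From (9.9) we find:

$$E^2 = p^2c^2 + m^2c^4 = p^2c^2\left(1 + \frac{m^2c^4}{p^2c^2}\right).$$

Assuming that the momentum is two orders of magnitude larger than the mass
term mc we have:

$$E = pc\sqrt{1 + 10^{-4}} \approx pc\left(1 + \frac{1}{2}10^{-4}\right).$$

Therefore the energy equals the momentum to the order $10^{-4}$. It follows that
$\sin\phi = 1 - O(10^{-4})$ (ultra relativistic limit).


(b) Working similarly we find:

$$E = mc^2\sqrt{1 + \frac{p^2}{m^2c^2}} = mc^2\left[1 + \frac{p^2}{2m^2c^2} + O\!\left(\frac{p^4}{m^4c^4}\right)\right] \approx mc^2\left(1 + \frac{1}{2}10^{-4}\right).$$
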
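
The size of the neglected terms in both limits can be confirmed with a short computation. This sketch is illustrative only; it assumes units with c = 1 and the function name is an assumption.

```python
import math


def energy(p, m, c=1.0):
    """Total energy from (9.9): E = sqrt(p^2 c^2 + m^2 c^4)."""
    return math.sqrt((p * c) ** 2 + (m * c**2) ** 2)


# (a) Ultra relativistic case: p c = 100 m c^2.
excess_ultra = energy(100.0, 1.0) / 100.0 - 1.0    # E / (p c) - 1

# (b) Non-relativistic case: m c^2 = 100 p c.
excess_nonrel = energy(1.0, 100.0) / 100.0 - 1.0   # E / (m c^2) - 1

print(excess_ultra, excess_nonrel)   # both close to 0.5e-4
```

In both cases the relative correction is about one half of $10^{-4}$, so replacing E by pc or by mc² respectively changes the result only in the fifth significant digit.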

From (9.9) we conclude that the energy of a ReMaP P in a LCF Σ varies with
the speed of P in Σ. This is also the case in Newtonian Physics. The difference
in Special Relativity is that for relativistic speeds (β > 0.8) the change of energy
with the velocity is much greater, and the energy tends to infinity as β → 1 (see Fig. 9.3).
This implies that an infinite supply of kinetic energy is required for a ReMaP to be
accelerated to the velocity c, which is consistent with the relativistic requirement
that c is the highest possible speed. Furthermore, either a particle is and stays for
ever a ReMaP or it is and stays for ever a photon!

From (9.5) and (9.7) we have that in a LCF Σ:

$$\boldsymbol{\beta} = \frac{\mathbf{p}c}{E}. \tag{9.10}$$


Fig. 9.3. Change of energy with the speed (curves: relativistic case, Newtonian case)


In spite of its simplicity relation (9.10) is very useful because it computes directly
the β-factor if the 3-momentum and the energy of the particle are known. Another
useful relation is found if we differentiate (9.9):

$$E\,dE = \mathbf{p}\cdot d\mathbf{p}\,c^2 = |\mathbf{p}|\,d|\mathbf{p}|\,c^2 \qquad \text{(why?)}$$

from which follows:

$$\beta = \frac{dE}{d|\mathbf{p}|\,c}. \tag{9.11}$$
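
Relations (9.10) and (9.11) give the same β, which can be checked by comparing the ratio pc/E with a numerical derivative of E with respect to |p| at constant mass. The sketch assumes units with c = 1; the function names and the step size are illustrative choices.

```python
import math


def energy(p, m, c=1.0):
    """Total energy from (9.9)."""
    return math.sqrt((p * c) ** 2 + (m * c**2) ** 2)


def beta_from_ratio(p, m, c=1.0):
    """Relation (9.10): beta = p c / E."""
    return p * c / energy(p, m, c)


def beta_from_slope(p, m, c=1.0, h=1e-6):
    """Relation (9.11): beta = dE / (c d|p|), by a central difference
    at constant mass m."""
    dE = energy(p + h, m, c) - energy(p - h, m, c)
    return dE / (2.0 * h * c)


p, m = 0.75, 1.0     # assumed sample values, for which E = 1.25
print(beta_from_ratio(p, m), beta_from_slope(p, m))
```

Both estimates return β ≈ 0.6 for this particle.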


Exercise 9.3.2 Let Σ, Σ′ be two LCF which are related by a boost in the standard
configuration with velocity factor β. Let P be a ReMaP with mass m which in Σ,
Σ′ has linear momentum $\mathbf{p}$, $\mathbf{p}'$ and energy E, E′ respectively. Prove the relations:

$$E' = \gamma(E - \beta p_x c) \tag{9.12}$$

$$p'_x = \gamma(p_x - \beta E/c) \tag{9.13}$$

$$p'_y = p_y, \qquad p'_z = p_z. \tag{9.14}$$


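Example 9.3.2. Consider P, Σ, Σ′ as in Exercise 9.3.2. Prove the relations:

$$\left(p'_x - \sqrt{\frac{1+\beta}{1-\beta}}\,p_x\right)\left(p'_x - \sqrt{\frac{1-\beta}{1+\beta}}\,p_x\right) = \beta^2\gamma^2(p_y^2 + p_z^2 + m^2c^2)$$

$$\left(E' - \sqrt{\frac{1+\beta}{1-\beta}}\,E\right)\left(E' - \sqrt{\frac{1-\beta}{1+\beta}}\,E\right) = -\beta^2\gamma^2c^2(p_y^2 + p_z^2 + m^2c^2).$$

Plot these relations when $p_y^2 + p_z^2 =$ constant. What happens when P is a photon?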
Solution 

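The four-momentum of P in Σ and Σ′ is $p^i = (E/c, \mathbf{p})_\Sigma$ and $p^i = (E'/c, \mathbf{p}')_{\Sigma'}$
respectively. The two expressions of the four-momentum are related by a boost
along the common x, x′ axis. Relation (9.13) gives:

$$(p'_x - \gamma p_x)^2 = \gamma^2\beta^2\frac{E^2}{c^2} = \gamma^2\beta^2(p_x^2 + \alpha^2)$$

where $\alpha^2 = p_y^2 + p_z^2 + m^2c^2$. This relation can be written:

$$(p'_x - \gamma p_x)^2 - (\gamma\beta p_x)^2 = \gamma^2\beta^2\alpha^2$$

or

$$[p'_x - (\gamma + \gamma\beta)p_x]\,[p'_x - (\gamma - \gamma\beta)p_x] = \gamma^2\beta^2\alpha^2.$$

But:

$$\gamma + \beta\gamma = \gamma(1 + \beta) = \sqrt{\frac{1+\beta}{1-\beta}}, \qquad \gamma - \beta\gamma = \gamma(1 - \beta) = \sqrt{\frac{1-\beta}{1+\beta}}.$$

Replacing we find:

$$\left(p'_x - \sqrt{\frac{1+\beta}{1-\beta}}\,p_x\right)\left(p'_x - \sqrt{\frac{1-\beta}{1+\beta}}\,p_x\right) = \beta^2\gamma^2\alpha^2. \tag{9.15}$$

Working similarly for the energy, with $p_x^2c^2 = E^2 - \alpha^2c^2$ in (9.12), we show:

$$\left(E' - \sqrt{\frac{1+\beta}{1-\beta}}\,E\right)\left(E' - \sqrt{\frac{1-\beta}{1+\beta}}\,E\right) = -\beta^2\gamma^2c^2\alpha^2. \tag{9.16}$$

If $\alpha^2 =$ constant these equations describe hyperbolae with asymptotes the straight
lines:

$$p'_x = \sqrt{\frac{1+\beta}{1-\beta}}\,p_x, \qquad p'_x = \sqrt{\frac{1-\beta}{1+\beta}}\,p_x$$

and respectively:

$$E' = \sqrt{\frac{1+\beta}{1-\beta}}\,E, \qquad E' = \sqrt{\frac{1-\beta}{1+\beta}}\,E.$$

The plotting is left to the reader. In case P is a photon relations (9.15) and (9.16)
become identical and furthermore the mass m = 0.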
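
A numerical spot check of (9.15) and (9.16) is a useful safeguard against sign slips. The sketch below applies the boost (9.12), (9.13) to a sample particle and returns the residuals (left side minus right side) of the two relations, with the right side of the energy relation taken as $-\beta^2\gamma^2c^2\alpha^2$. The function name and the sample values are assumptions; units with c = 1 are used by default.

```python
import math


def hyperbola_residuals(m, p, beta, c=1.0):
    """Return (res_p, res_E): left side minus right side of (9.15) and (9.16)
    for a particle of mass m and 3-momentum p = (px, py, pz) in Sigma."""
    px, py, pz = p
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    alpha2 = py**2 + pz**2 + (m * c) ** 2
    E = c * math.sqrt(px**2 + alpha2)

    # Boost along the common x, x' axis, relations (9.12) and (9.13).
    E_prime = gamma * (E - beta * px * c)
    px_prime = gamma * (px - beta * E / c)

    k_plus = math.sqrt((1.0 + beta) / (1.0 - beta))
    k_minus = 1.0 / k_plus

    res_p = (px_prime - k_plus * px) * (px_prime - k_minus * px) \
        - beta**2 * gamma**2 * alpha2
    res_E = (E_prime - k_plus * E) * (E_prime - k_minus * E) \
        + beta**2 * gamma**2 * c**2 * alpha2
    return res_p, res_E


print(hyperbola_residuals(1.0, (0.3, 0.4, 0.0), 0.5))   # both zero up to rounding
```

Both residuals vanish to rounding accuracy, also for other values of β and for c different from 1.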




Exercise 9.3.3. The LCF Σ and Σ′ are moving with parallel axes and relative
velocity $\mathbf{u}$. A ReMaP P has four-momentum $(E/c, \mathbf{p})_\Sigma$ and $(E'/c, \mathbf{p}')_{\Sigma'}$ in Σ, Σ′
respectively.


a. Show that the components of the four-momentum in Σ and Σ′ are related as
follows:

$$E' = \gamma(E - \mathbf{u}\cdot\mathbf{p}) \tag{9.17}$$

$$\mathbf{p}' = \mathbf{p} + \mathbf{u}\left[\frac{\mathbf{u}\cdot\mathbf{p}}{u^2}(\gamma - 1) - \gamma\frac{E}{c^2}\right]. \tag{9.18}$$

Show that in the case of a boost these relations reduce to (9.12), (9.13), and
(9.14).

b. Show that the angles θ, θ′ of the 3-momenta $\mathbf{p}$, $\mathbf{p}'$ with the relative velocity $\mathbf{u}$ of
Σ and Σ′ are related as follows:

$$\cot\theta' = \gamma\left(\cot\theta - \frac{\beta}{B\sin\theta}\right) \tag{9.19}$$

where $B = pc/E$ is the speed factor of P in Σ. Relation (9.19) is called the particle
aberration formula. It is used in the focussing of beams of particles.
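
The aberration formula can be tested against a direct transformation of the momentum components. The sketch below is an illustration with assumed names and values (units with c = 1): it computes θ′ once from the boosted components parallel and normal to u, and once from (9.19) with the speed factor B = pc/E of P in Σ.

```python
import math


def theta_prime_from_components(p, theta, m, beta, c=1.0):
    """Angle of p' with the relative velocity, from the boosted components."""
    E = math.sqrt((p * c) ** 2 + (m * c**2) ** 2)
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    p_par = gamma * (p * math.cos(theta) - beta * E / c)   # component along u
    p_perp = p * math.sin(theta)                           # normal component, unchanged
    return math.atan2(p_perp, p_par)


def theta_prime_from_formula(p, theta, m, beta, c=1.0):
    """Angle from the aberration formula (9.19), valid for 0 < theta < pi."""
    E = math.sqrt((p * c) ** 2 + (m * c**2) ** 2)
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    B = p * c / E                                          # speed factor of P in Sigma
    cot_tp = gamma * (1.0 / math.tan(theta) - beta / (B * math.sin(theta)))
    return math.atan2(1.0, cot_tp)                         # arccot with values in (0, pi)


args = (2.0, math.radians(40.0), 1.0, 0.5)   # p, theta, m, beta (assumed values)
print(theta_prime_from_components(*args), theta_prime_from_formula(*args))
```

The two angles coincide, which would not be the case if the speed factor of P in Σ′ were used in place of B.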


Exercise 9.3.4 Let $\mathbf{p}$, E be the 3-momentum and the energy of a ReMaP of mass m
in the LCF Σ. Prove the relations:

$$d\mathbf{p} = \gamma\,dm\,\mathbf{u} + m\,d(\gamma\mathbf{u}) \tag{9.20}$$

$$dE = mc^2\,d\gamma + \gamma c^2\,dm. \tag{9.21}$$

Deduce that the change of 3-momentum or the energy of a ReMaP in Σ is due either
to a change of γ (the speed) or to a change of mass or both.


Exercise 9.3.5 Assumptions as in Exercise 9.3.4. Prove the relations:

$$E = m\gamma c^2, \qquad p^2 = \beta^2\gamma^2m^2c^2.$$

Deduce that in order to compute the energy and the measure of the 3-momentum of
a ReMaP of mass m in Σ (not a photon!) it is enough to know the γ-factor of the
ReMaP in Σ.

Note: For the photons we have pc = E = hν where h is the Planck constant and
ν is the frequency of the photon. Hence in the case of a photon it is enough to know
the frequency of the photon in Σ.




Example 9.3.3 A ReMaP of mass m and four-velocity $u^i$ has four-momentum $p^i =
mu^i$. Prove that the energy and the length of the 3-momentum of the ReMaP in a
LCF Σ in which the ReMaP has velocity factor γ are given by the relations:

$$E = -\gamma(p^i u_i) \tag{9.22}$$

$$p^2 = (p^i p_i) + (\gamma^2/c^2)(p^i u_i)^2. \tag{9.23}$$

Solution
We note the relations $p^i u_i = -mc^2$ and $p^i p_i = -m^2c^2$. Then:

$$E = m\gamma c^2 = -\gamma(p^i u_i)$$

$$p^2 = m^2\gamma^2u^2 = m^2c^2(\gamma^2 - 1) = \left(\frac{\gamma^2}{c^2}\right)(p^i u_i)^2 + (p^i p_i).$$

Example 9.3.4 A beam of electrons of average energy E is produced in a linear
accelerator. The focussing of the beam is assumed to be perfect and all
electrons are assumed to have the same energy E with deviation α%. This means that the velocities
of the electrons are parallel and their energy is between the values $E_+ = (1 + \alpha)E$
and $E_- = (1 - \alpha)E$. Calculate the maximum relative speed of the electrons in the
beam if α = 0.1, E = 1 GeV, m = 0.5 MeV/c². Does the result depend on the
value of the average energy E?
Solution 

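Let $u^i_+$, $u^i_-$ be the four-velocities of the electrons in the beam with energies
$E_+ = (1 + \alpha)E$ and $E_- = (1 - \alpha)E$ respectively. The relative velocity of the
electrons is the four-vector $u^i_{+-} = u^i_+ - u^i_-$ and has length:

$$u^i_{+-}u_{+-i} = (u^i_+ - u^i_-)(u_{+i} - u_{-i}) = -2c^2 - 2u^i_+u_{-i}. \tag{9.24}$$

We calculate the inner product $u^i_+u_{-i}$ in the proper frame of $u^i_-$. In that frame
$u^i_- = \begin{pmatrix}c\\ \mathbf{0}\end{pmatrix}$, $u^i_+ = \begin{pmatrix}\gamma_{+-}c\\ \gamma_{+-}\mathbf{v}_{+-}\end{pmatrix}$, where $\gamma_{+-}$ is the γ-factor of the relative velocity
$\mathbf{v}_{+-}$. The product $u^i_+u_{-i} = -\gamma_{+-}c^2$. Replacing in (9.24) we find:

$$u^i_{+-}u_{+-i} = 2(\gamma_{+-} - 1)c^2. \tag{9.25}$$

We note that the length of the four-vector $u^i_{+-}$ is positive, hence $u^i_{+-}$ is not a
four-velocity!

We calculate $\gamma_{+-}$ in the laboratory, where the four-velocities $u^i_+$, $u^i_-$ are given.
From the relations $u^i_+ = \frac{p^i_+}{m}$, $u^i_- = \frac{p^i_-}{m}$ and the relation $u^i_+u_{-i} = -\gamma_{+-}c^2$ we find:

$$\gamma_{+-} = -\frac{1}{m^2c^2}\,p^i_+p_{-i}. \tag{9.26}$$

We calculate the inner product $p^i_+p_{-i}$ in the laboratory. We have the components:

$$p^i_+ = \begin{pmatrix}E_+/c\\ p_+\hat{\mathbf{e}}\end{pmatrix}_\Sigma, \qquad p^i_- = \begin{pmatrix}E_-/c\\ p_-\hat{\mathbf{e}}\end{pmatrix}_\Sigma$$

from which follows:

$$p^i_+p_{-i} = -\frac{1}{c^2}E_+E_- + p_+p_-.$$

The length:

$$p_+ = \frac{1}{c}\sqrt{E_+^2 - m^2c^4} \approx \frac{E_+}{c}\left(1 - \frac{m^2c^4}{2E_+^2}\right)$$

and similarly:

$$p_- \approx \frac{E_-}{c}\left(1 - \frac{m^2c^4}{2E_-^2}\right).$$

Therefore the product:

$$p_+p_- \approx \frac{E_+E_-}{c^2}\left[1 - \frac{m^2c^4}{2}\left(\frac{1}{E_+^2} + \frac{1}{E_-^2}\right)\right].$$

Replacing in (9.26) we find:

$$\gamma_{+-} = \frac{1}{2}\left(\frac{E_+}{E_-} + \frac{E_-}{E_+}\right).$$

The β factor which corresponds to the $\gamma_{+-}$-factor is given by the relation:

$$\beta_{+-} = \frac{E_+^2 - E_-^2}{E_+^2 + E_-^2}.$$

In terms of the deviation α of the energy we have:

$$\beta_{+-} = \frac{(1 + \alpha)^2 - (1 - \alpha)^2}{(1 + \alpha)^2 + (1 - \alpha)^2} = \frac{2\alpha}{1 + \alpha^2}.$$

We note that in our approximation (which is reasonable) $\beta_{+-}$ is independent of the
average energy E and depends only on the deviation α.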
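
For the numerical values of the example, the approximation can be compared with the exact invariant product (9.26). The sketch below is illustrative: energies are in MeV, `mc2` stands for the rest energy mc² = 0.5 MeV, and the function names are assumptions.

```python
import math


def gamma_rel_exact(E_plus, E_minus, mc2):
    """gamma of the relative velocity from (9.26) for two collinear particles
    of equal mass: (E+ E- - p+ c p- c) / (m c^2)^2."""
    pc_plus = math.sqrt(E_plus**2 - mc2**2)
    pc_minus = math.sqrt(E_minus**2 - mc2**2)
    return (E_plus * E_minus - pc_plus * pc_minus) / mc2**2


def gamma_rel_approx(E_plus, E_minus):
    """Approximation of the text: (E+/E- + E-/E+) / 2."""
    return 0.5 * (E_plus / E_minus + E_minus / E_plus)


def beta_from_gamma(gamma):
    return math.sqrt(1.0 - 1.0 / gamma**2)


alpha, E, mc2 = 0.1, 1000.0, 0.5      # data of the example, energies in MeV
E_plus, E_minus = (1 + alpha) * E, (1 - alpha) * E

beta_exact = beta_from_gamma(gamma_rel_exact(E_plus, E_minus, mc2))
beta_approx = 2 * alpha / (1 + alpha**2)
print(beta_exact, beta_approx)        # both about 0.198
```

The maximum relative speed is about 0.198c, and repeating the calculation with a different average energy E (much larger than mc²) leaves this value practically unchanged.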


9.4 The Four-Momentum of Photons (Luxons) 


Luxons are particles (photons and probably other particles) whose speed in all 
LCF equals c. These particles do not have a proper frame. They play a double 
fundamental role in Special Relativity. They are the natural “bullets” which are 
used by the relativistic observers in order to coordinate the events in spacetime 
(chronometry) and also they are the carriers of information among these observers. 
Beyond that, photons are fundamental constituents of nuclear
reactions. Therefore the study of their kinematics and especially their dynamics is
essential. However due to their characteristics they have “peculiarities” which must be
taken into account. Let us start with the geometric considerations by assuming that 
the only luxons are the photons. 

The worldlines of photons are null straight lines in Minkowski space $M^4$. If $x^i$ is
the position vector of a photon, then $x^i$ is a null four-vector and in any LCF $\Sigma$ can
be written in the form:

$$x^i = A(t, \mathbf{r})\begin{pmatrix} 1 \\ \hat{\mathbf{e}} \end{pmatrix}_\Sigma \qquad (9.27)$$

where $A(t, \mathbf{r})$ is a scalar function (not invariant!) depending on the coordinates $t, \mathbf{r}$
of the photon in $\Sigma$ and $\hat{\mathbf{e}}$ is the unit 3-vector in the direction of propagation of the
photon in $\Sigma$. For example a photon which propagates along the 3-direction $(1, 1, 1)^t$
in $\Sigma$ and at the moment $t = 0$ of $\Sigma$ passes through the origin has position vector:

$$x^i = ct\left(1, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}\right)^t_\Sigma. \qquad (9.28)$$


The four-velocity of a photon cannot be defined by the relation $dx^i/d\tau$ because
photons do not have proper time. The same holds for the four-acceleration, which 
has also the additional constraint that the speed in all LCF equals c. We also have 
a problem with the four-momentum because one cannot consider mass with the 
standard connotation for the photons, except as a minimum of the mass of the 
ReMaP. 

In order to circumvent these difficulties and be consistent with what we have 
done already, we continue using the process of the limit. We start with the four- 
momentum. 




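Because we have assumed the “mass” of photons to equal 0, the four-momentum
$p^i$ satisfies $p^i p_i = -m^2c^2 = 0$, that is, it must be a null four-vector. This means that in any LCF $\Sigma$,
$p^i$ can be written in the form:

$$p^i = B(t, \mathbf{r})\begin{pmatrix} 1 \\ \hat{\mathbf{e}} \end{pmatrix}_\Sigma \qquad (9.29)$$

where $B(t, \mathbf{r})$ is a scalar (but not a Lorentz invariant!). From Newtonian Physics
we know that a photon of frequency $\nu$, which propagates along the direction $\hat{\mathbf{e}}$, has
energy $E$ and momentum $\mathbf{p}$ given by:

$$E = h\nu \qquad (9.30)$$

$$\mathbf{p} = \frac{h\nu}{c}\hat{\mathbf{e}} = \frac{E}{c}\hat{\mathbf{e}}. \qquad (9.31)$$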


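Based on the above analysis we define the four-momentum of a photon in a LCF $\Sigma$
in which the photon has energy $E (= h\nu)$ and 3-momentum $\mathbf{p} (= \frac{E}{c}\hat{\mathbf{e}})$ with the
relation:

$$p^i = \begin{pmatrix} E/c \\ (E/c)\,\hat{\mathbf{e}} \end{pmatrix}_\Sigma. \qquad (9.32)$$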

This definition connects the two natures of the photon that is, the wave and the 
particle nature. 


Example 9.4.1 In the LCF $\Sigma$ a photon has energy $E$ and 3-momentum $\mathbf{p}$. Calculate
the energy $E'$ and the direction of propagation $\hat{\mathbf{e}}'$ of the photon in an LCF $\Sigma'$, which
moves relative to $\Sigma$ with velocity $\mathbf{u}$. Express the angle $\theta'$ between $\hat{\mathbf{e}}'$ and $\mathbf{u}$ in terms
of the angle $\theta$ between $\hat{\mathbf{e}}$ and $\mathbf{u}$. Comment on the result.
Solution 

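Suppose the four-momentum $p^i$ of the photon in $\Sigma$ and $\Sigma'$ has representation
$p^i = (E/c, \mathbf{p})_\Sigma$ and $p^i = (E'/c, \mathbf{p}')_{\Sigma'}$ respectively. The Lorentz transformation
parallel and normal to the relative velocity $\mathbf{u}$ gives (recall that $\mathbf{p}_\parallel = \frac{\mathbf{p}\cdot\mathbf{u}}{u^2}\mathbf{u}$ and $\boldsymbol{\beta} = \mathbf{u}/c$):

$$E'/c = \gamma(u)\left(E/c - \boldsymbol{\beta}\cdot\mathbf{p}_\parallel\right) \qquad (9.33)$$

$$\frac{E'}{c}\hat{\mathbf{e}}'_\perp = \mathbf{p}_\perp \qquad (9.34)$$

$$\frac{E'}{c}\hat{\mathbf{e}}'_\parallel = \gamma(u)\left(\mathbf{p}_\parallel - \boldsymbol{\beta}\frac{E}{c}\right). \qquad (9.35)$$

From (9.33) we have for the energy $E'$ in $\Sigma'$:

$$E' = \gamma(u)\left(E - \mathbf{p}\cdot\mathbf{u}\right). \qquad (9.36)$$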


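Adding the remaining two equations we have:

$$\frac{E'}{c}\hat{\mathbf{e}}' = \mathbf{p}_\perp + \gamma(u)\left[\frac{\mathbf{p}\cdot\mathbf{u}}{u^2} - \frac{E}{c^2}\right]\mathbf{u} = \mathbf{p} + \left[(\gamma(u) - 1)\frac{\mathbf{p}\cdot\mathbf{u}}{u^2} - \gamma(u)\frac{E}{c^2}\right]\mathbf{u}$$

from which follows:

$$\hat{\mathbf{e}}' = \frac{c}{\gamma(u)(E - \mathbf{p}\cdot\mathbf{u})}\left\{\mathbf{p} + \left[(\gamma(u) - 1)\frac{\mathbf{p}\cdot\mathbf{u}}{u^2} - \gamma(u)\frac{E}{c^2}\right]\mathbf{u}\right\}. \qquad (9.37)$$

Relation (9.37) is known as the light aberration formula. [Prove (9.37) by direct
application of the Lorentz transformation].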
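Concerning the angle $\theta'$ we consider $\theta \neq k\pi/2$, $k = 0, 1, 2, 3$ and $|\mathbf{p}c| = E$
(because we have photons), hence the components of $\mathbf{p}$ parallel and normal to $\mathbf{u}$ are $p_\parallel = \frac{E}{c}\cos\theta$, $p_\perp = \frac{E}{c}\sin\theta$. It follows:

$$\cot\theta' = \frac{\hat{e}'_\parallel}{\hat{e}'_\perp} = \frac{\gamma(u)\left[(E/c)\cos\theta - (E/c)\beta\right]}{(E/c)\sin\theta} = \gamma(u)\frac{\cos\theta - \beta}{\sin\theta} = \gamma(u)\left(\cot\theta - \frac{\beta}{\sin\theta}\right). \qquad (9.38)$$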


[Compare this relation with (9.19). What are their similarities and differences? ] 

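We consider next the special cases $\theta = k\pi/2$, $k = 0, 1, 2, 3$. When $\theta = 0$
from (9.34) we have $\mathbf{p}_\perp = 0 \Rightarrow \hat{\mathbf{e}}'_\perp = 0$ hence $\theta' = 0$, that is, the directions
of propagation of the photon in $\Sigma$ and $\Sigma'$ coincide. The same result we find if we
work directly with (9.37). Indeed we have ($\mathbf{u} = u\hat{\mathbf{e}}$, $\mathbf{p} = \frac{E}{c}\hat{\mathbf{e}}$):

$$\hat{\mathbf{e}}' = \frac{c}{\gamma(u)(E - \beta E)}\left[\frac{E}{c} + (\gamma(u) - 1)\frac{E}{c} - \gamma(u)\frac{\beta E}{c}\right]\hat{\mathbf{e}} = \hat{\mathbf{e}}.$$

When $\theta = \frac{\pi}{2}$ then $\mathbf{p}\cdot\mathbf{u} = 0$ and (9.37) gives:

$$\hat{\mathbf{e}}' = \frac{1}{\gamma(u)}\left(\hat{\mathbf{e}} - \gamma(u)\boldsymbol{\beta}\right).$$

In this case $\cot\theta' = -\beta\gamma < 0$, hence $\theta' \in (\frac{\pi}{2}, \pi)$.
Working similarly we show that when $\theta = \pi$ then $\theta' = \pi$ and when $\theta = 3\pi/2$
then $\theta' \in (\pi, \frac{3\pi}{2})$.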
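The relations (9.36), (9.37) and (9.38) can be cross-checked numerically. The sketch below is an illustration only: it assumes units with $c = 1$ by default, a relative velocity along the x-axis, and the hypothetical helper names `boost_photon` and `cot_theta_prime`.

```python
import math

def boost_photon(E, e_hat, u, c=1.0):
    """Energy and propagation direction of a photon in a frame moving
    with (non-zero) velocity u, following (9.36) and (9.37)."""
    p = [E / c * x for x in e_hat]                 # 3-momentum p = (E/c) e
    u2 = sum(x * x for x in u)
    gamma = 1.0 / math.sqrt(1.0 - u2 / c**2)
    p_dot_u = sum(a * b for a, b in zip(p, u))
    E_prime = gamma * (E - p_dot_u)                # relation (9.36)
    k = (gamma - 1.0) * p_dot_u / u2 - gamma * E / c**2
    e_prime = [c / E_prime * (pi + k * ui) for pi, ui in zip(p, u)]  # (9.37)
    return E_prime, e_prime

def cot_theta_prime(theta, beta):
    """Aberration formula (9.38) for the angle between e' and u."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (math.cos(theta) - beta) / math.sin(theta)

# Photon at 60 degrees to the x-axis, frame moving with beta = 0.6 along x.
theta, beta = math.radians(60.0), 0.6
E_prime, e_prime = boost_photon(
    1.0, (math.cos(theta), math.sin(theta), 0.0), (beta, 0.0, 0.0))
print(E_prime, e_prime[0] / e_prime[1], cot_theta_prime(theta, beta))
```

The transformed direction remains a unit vector, and the ratio of its parallel to normal component reproduces (9.38).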


9.5 The Four-Momentum of Particles 


In the previous sections we defined the four-momentum of the ReMaP and the 
photons in a different way. The necessity for this approach was due to the different 
character of this four-vector for each class of particles, that is timelike for the 
ReMaP and null for the photons. However in nuclear reactions both types of 
particles are involved therefore it would be useful to have a unified approach. This 
leads us to the introduction of the particle four-vector, which is a four-vector whose
(Lorentz) length is either negative or zero. This approach gives us the possibility to
give a dynamic definition for the elementary particles.


Definition 9.5.1 Particle in the Theory of Special Relativity is any physical system 
whose four-momentum is a particle four-vector. 


This definition allows us to use all the results on particle four-vectors in the study 
of relativistic collisions. However it is to be noted that the results of that section 
apply to all particle four-vectors not only to the four-momentum. 


9.6 The System of Natural Units 


In Newtonian Physics the physical quantities spatial length (L), time (T) and mass 
(M) are absolute (i.e. Euclidean invariants) therefore it is reasonable to develop a
system of units whose fundamental elements are the [L,T,M]. This system is the 
International System of Units (SI Units) which is an evolution (in 1960) of the well 
known MKSA system with the addition of the units of Kelvin, Candela and (later 
on) the mole. 

In the Theory of Special Relativity the quantities T, L are not relativistic physical 
quantities hence the SI, although it can still be used, is not anymore inherent in that 
theory. 

In order to find a new system of units suitable for Special Relativity we consider 
the new fundamental invariant relativistic quantities. The first such quantity is the 
value c, therefore in the “relativistic” system the universal constant c will equal the 
pure number 1. The new system we call the System of Natural Units (NU). 

In this system the unit of spatial length 1 m is related to the unit of time 1 s with
the relation:

$$1\,\text{s} = 3 \times 10^{8}\,\text{m}. \qquad (9.39)$$


The independent dimensions in the System of Natural Units are (among other) the 
[L M]. 




In order to convert the value of a quantity from the SI system to the system of
Natural Units one applies the following simple rule:

Multiply the value of the physical quantity in the SI units with the quantity $c = [c]\,\frac{\text{m}}{\text{s}}$, where
$[c] = 3 \times 10^{8}$, raised to the same power as the time T appears in the units of the physical quantity
in the SI system.


Let us see some applications of this rule. 

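Spatial length (L in SI)

The dimensions are the same in both systems of units, i.e. 1 m. Indeed, according
to the rule

$$1\,\text{m in SI} = [c]^{0}\,(\text{m/s})^{0}\,\text{m} = 1\,\text{m in NU}.$$

Time (T in SI)

The unit of time in NU is the spatial length m. Indeed according to the rule:

$$1\,\text{s in SI} = [c]^{1}\,(\text{m/s})^{1}\,\text{s} = [c]\,\text{m in NU}$$

which is in agreement with (9.39).

Velocity ($LT^{-1}$ in SI)

$$1\,\text{m s}^{-1}\text{ in SI} = [c]^{-1}\,(\text{m/s})^{-1}\,(\text{m/s}) = [c]^{-1}\text{ in NU}.$$

We note that in NU the velocity is a pure number, as expected.

Acceleration ($LT^{-2}$ in SI)

$$1\,\text{m s}^{-2}\text{ in SI} = [c]^{-2}\,(\text{m/s})^{-2}\,\text{m/s}^{2} = [c]^{-2}\,\text{m}^{-1}\text{ in NU}.$$

Force ($MLT^{-2}$ in SI)

$$1\,\text{N in SI} = 1\,\text{kg m s}^{-2} = [c]^{-2}\,(\text{m/s})^{-2}\,\text{kg m/s}^{2} = [c]^{-2}\,\text{kg m}^{-1}\text{ in NU}.$$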


Exercise 9.6.1 Show the validity of the following transformations between the
values of the physical quantities in the SI and the NU system.

Energy (N m):

$$1\,\text{J in SI} = [c]^{-2}\,\text{kg in NU}.$$

(Note that the dimension of energy in NU is mass (kg), which is compatible with the
relation $E = mc^2$ and allows us to measure energy in kg and conversely mass in J).

Pressure:

$$1\,\text{N m}^{-2}\text{ in SI} = [c]^{-2}\,\text{kg m}^{-3}\text{ in NU}.$$

Energy density:

$$1\,\text{J m}^{-3}\text{ in SI} = [c]^{-2}\,\text{kg m}^{-3}\text{ in NU}.$$

Mass density:

$$1\,\text{kg m}^{-3}\text{ in SI} = 1\,\text{kg m}^{-3}\text{ in NU}.$$

Momentum:

$$1\,\text{kg m s}^{-1}\text{ in SI} = [c]^{-1}\,\text{kg in NU}.$$

(Note that in the NU system the dimensions of energy are identical with those of
linear momentum).

Planck constant:

$$h = 6.63 \times 10^{-34}\,\text{J s in SI} = 6.63 \times 10^{-34}\,[c]^{-2}\,\text{kg} \times [c]\,\text{m} = 2.21 \times 10^{-42}\,\text{kg m in NU}.$$
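The conversion rule lends itself to a one-line function. The following sketch assumes the rounded value $[c] = 3 \times 10^{8}$ used in the text; the name `si_to_nu` and the argument `time_power` (the exponent of T in the SI dimensions of the quantity) are illustrative.

```python
C_VALUE = 3.0e8  # the pure number [c] used in the text

def si_to_nu(value_si, time_power):
    """Convert an SI value to Natural Units by multiplying with
    [c] raised to the power with which time T enters the SI dimensions."""
    return value_si * C_VALUE ** time_power

one_second = si_to_nu(1.0, 1)        # T^1: 1 s expressed in m
one_joule = si_to_nu(1.0, -2)        # M L^2 T^-2: 1 J expressed in kg
planck = si_to_nu(6.63e-34, -1)      # M L^2 T^-1: h expressed in kg m
print(one_second, one_joule, planck)
```

Only the power of time is needed because length and mass keep their units in the NU system.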


Apart from the introduction of the NU system, there is also the need to adapt the
units of measurement to the scale of relativistic phenomena. Indeed, just as it is nonsense
to measure the speed of a car in mm/day, it makes no sense to measure
the energy of an elementary particle in J. The basic physical quantity in the study of
elementary particles (at the level we are dealing with) is the energy. Since in NU the
units of energy are identical with the units of mass (because $c = 1$ and $E = mc^2$),
we measure the mass of elementary particles in units of energy. For the reasons
mentioned above these units are not the erg (C.G.S.) or the Joule (S.I.) but the
eV and its multiples. By definition 1 eV is the kinetic energy acquired by an electron
which starts from rest and moves in empty space between two points whose potential
difference is 1 V. Let us calculate how much energy 1 eV is.

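The charge of the electron equals $1.6021 \times 10^{-19}$ C, therefore:

$$1\,\text{eV} = 1.6021 \times 10^{-19}\,\text{J} = 1.602 \times 10^{-12}\,\text{erg}.$$

Inverting this relation we find:

$$1\,\text{J} = 6.24 \times 10^{18}\,\text{eV}.$$

The multiples of eV are the MeV $= 10^{6}$ eV, GeV $= 10^{9}$ eV $= 10^{3}$ MeV and the
TeV $= 10^{12}$ eV $= 10^{3}$ GeV $= 10^{6}$ MeV. Obviously:

$$1\,\text{MeV} = 1.602 \times 10^{-13}\,\text{J} = 1.602 \times 10^{-6}\,\text{erg}$$

$$1\,\text{J} = 6.24 \times 10^{12}\,\text{MeV}.$$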




As we have mentioned, in the NU the units of energy (e.g. MeV) have the
same dimension as the units of mass. In Exercise 9.6.1 we have noted that the
dimensions of linear momentum in the SI system follow from the units of energy in
the NU if one multiplies by the factor $[c]^{-1}$. Similarly the units of mass in the SI
system are found from the units of energy in the NU system by multiplication
by the factor $[c]^{-2}$. For this reason the units of 3-momentum are written as MeV/c
and the units of mass as MeV/c².
Let us see some practical calculations. 


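Example 9.6.1 The mass of the electron in the SI system equals $9.1 \times 10^{-28}$ g. What is
the mass of the electron in MeV/c² (that is, in the NU system)?
Solution

We have:

Mass of electron in NU (MeV/c²) =
Mass of electron in kg × $[c]^2$ = $(9.1 \times 10^{-31}\,\text{kg}) \times (3 \times 10^{8}\,\text{m/s})^2$

$$= 8.19 \times 10^{-14}\,\text{J} = \frac{8.19 \times 10^{-14}}{1.602 \times 10^{-13}}\,\text{MeV} = 0.51\,\text{MeV}.$$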


Example 9.6.2 The mass of the proton in the NU system equals 938 MeV/c².
Calculate the mass of the proton in g (that is, in SI units).
Solution

We have:

[Mass of the proton in g (SI)] = [Mass of the proton in MeV/c² (NU)] / $c^2$ =

$$\frac{(938 \times 1.602 \times 10^{-6})\,\text{erg}}{(3 \times 10^{10}\,\text{cm/s})^2} = 1.67 \times 10^{-24}\,\text{g}.$$


Exercise 9.6.2 


a. Show that if the mass of a ReMaP equals $m$ g, then its mass in NU equals
$5.616\,m \times 10^{26}$ MeV/c².

b. Show that if the mass of a ReMaP in NU equals $E$ MeV/c², then its mass in g
is $m = \frac{E}{5.616} \times 10^{-26}$ g.

c. A particle has linear momentum 15 GeV/c. Show that in SI its momentum equals
$8.01 \times 10^{-18}$ kg m s$^{-1}$.
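The numbers in Examples 9.6.1, 9.6.2 and Exercise 9.6.2 can be reproduced with a few helper functions. This is a sketch under the rounded constants used in the text ($c = 3 \times 10^{8}$ m/s, 1 MeV $= 1.602 \times 10^{-13}$ J); the function names are hypothetical.

```python
C = 3.0e8               # speed of light in m/s (rounded, as in the text)
J_PER_MEV = 1.602e-13   # joules per MeV

def kg_to_mev_c2(mass_kg):
    """Mass in kg -> rest energy in MeV (mass in MeV/c^2)."""
    return mass_kg * C**2 / J_PER_MEV

def mev_c2_to_kg(mass_mev):
    """Mass in MeV/c^2 -> mass in kg."""
    return mass_mev * J_PER_MEV / C**2

def mev_c_to_si(momentum_mev):
    """Momentum in MeV/c -> momentum in kg m/s."""
    return momentum_mev * J_PER_MEV / C

electron_mev = kg_to_mev_c2(9.1e-31)
proton_g = mev_c2_to_kg(938.0) * 1.0e3       # kg -> g
momentum_si = mev_c_to_si(15.0e3)            # 15 GeV/c = 15000 MeV/c
print(electron_mev, proton_g, momentum_si)
```

Small differences in the last digit relative to the quoted values come from rounding $1/1.602 \times 10^{-13}$ to $6.24 \times 10^{12}$.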


Example 9.6.3 In the LCF $\Sigma$ two electrons have equal kinetic energies 1 MeV and
are moving in opposite directions. Compute:

a. The speed of each electron in $\Sigma$

b. The relative velocity of the electrons

(It is given that the mass of the electron in NU is $m = 0.51$ MeV/c²).


Solution

a. If $T$ is the kinetic energy of the electron then the total energy $E = T + mc^2$. But
$E = m\gamma c^2$. Therefore:

$$\gamma = \frac{T + mc^2}{mc^2} = 1 + \frac{T}{mc^2}.$$

Replacing, we find $\gamma = 2.96$, hence $\beta = \sqrt{1 - 1/\gamma^2} = 0.941$.

b. The relative velocity of the electrons equals the velocity of one of them in the
rest frame of the other. The relativistic rule for spatial velocities gives:

$$|v_{rel}| = \frac{u + u}{1 + uu/c^2} = \frac{2\beta}{1 + \beta^2}\,c$$

where $\beta = u/c$ is the $\beta$ factor of each electron in $\Sigma$. Replacing, we find $|v_{rel}| =
0.998c$.
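The arithmetic of this example is short enough to verify directly. The sketch below assumes units with $c = 1$ and energies in MeV; the function names are illustrative.

```python
import math

def gamma_from_kinetic(T, mc2):
    """gamma = 1 + T / (m c^2) for kinetic energy T and rest energy mc2."""
    return 1.0 + T / mc2

def beta_from_gamma(gamma):
    """beta = sqrt(1 - 1/gamma^2)."""
    return math.sqrt(1.0 - 1.0 / gamma**2)

def relative_beta(beta1, beta2):
    """Relative speed (in units of c) of two particles approaching
    head-on with speeds beta1 and beta2."""
    return (beta1 + beta2) / (1.0 + beta1 * beta2)

gamma = gamma_from_kinetic(1.0, 0.51)    # T = 1 MeV, mc^2 = 0.51 MeV
beta = beta_from_gamma(gamma)
beta_rel = relative_beta(beta, beta)
print(gamma, beta, beta_rel)
```

The relative speed stays below $c$ even though each electron moves at about $0.94c$ in $\Sigma$.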


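Example 9.6.4 Calculate to the order of the third decimal digit the $\beta$ factor of a
pion of momentum 10.0 GeV/c. It is given that the mass of the pion is $1.40 \times
10^{-1}$ GeV/c².

Solution

It is given: $|\mathbf{p}c| = 10.0$ GeV and $mc^2 = 0.140$ GeV. Then:

$$\beta\gamma = \frac{|\mathbf{p}c|}{E}\,\gamma = \frac{|\mathbf{p}c|}{m\gamma c^2}\,\gamma = \frac{|\mathbf{p}c|}{mc^2} = 71.4$$

therefore:

$$\gamma = \sqrt{1 + (\beta\gamma)^2} = 71.4$$

$$\beta = \sqrt{1 - \frac{1}{\gamma^2}} = 0.9999,$$

that is, $\beta = 1.000$ to the third decimal digit.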


Chapter 10
Relativistic Reactions


10.1 Introduction 


According to the Newtonian point of view, matter consists of absolute units, the 
particles, which were created once and ever since their number and their identity 
are preserved. Today we know that this point of view is not valid. In nature, nuclear 
reactions occur at all times so that particles are created and destroyed in such a way 
that their number and identity are constantly changing. A dramatic example of such 
a change is the explosion of the nuclear bomb and on a bigger scale the “burning” 
of the mass of the sun. A world, in which both the number and the identity of the 
particles are not constant but change, either spontaneously or by external causes, is 
compatible with the point of view of Special Relativity. Because, according to that 
theory, a system: 


1. Is characterized only by the values of the various relativistic physical quantities 
(mass, charge, spin etc) defining the system. 

2. Is described by means of states, each state being an aggregation of a different (in 
general) number and type of particles and fields binding them. 

3. The state of a system can change to another state either spontaneously or by
external causes. We say that each change of state is a state transition of the
system. The physical processes which cause state transitions of particle systems
we call particle reactions or scattering.

4. The state transitions of a system take place so that all the physical quantities 
(mass, charge, spin etc), which define the system, are conserved. 


For example, according to the relativistic view the reaction:

$$n \rightarrow p + e^- + \bar{\nu}_e$$


describes the transition of a physical system, which in one state appears as a neutron 
and in the other as the aggregate of a proton, an electron and a neutrino. What is the 


© Springer Nature Switzerland AG 2019 303 
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-27347-7_10 




same in the two “appearances” of the system are the total four-momentum, the total
charge, the total spin etc. As a second example, the states electron and $\gamma + \gamma$ are
not possible for a system, because the charge is not conserved. This implies that the
Newtonian distinction of systems in simple and complex according to the number of
particles they involve, does not apply. Indeed, as we have seen in the above example,
the same system in one state appears as a single particle (the $n$) and in another as
the aggregate of three particles (the $p$, $e^-$, $\bar{\nu}_e$). Furthermore, it is possible that each
particle is by itself a system and in some states can appear as more particles and so
on.

If in a reaction (or scattering) of particles the number and the identity of particles 
are preserved, we call the reaction elastic. In all other cases, the reaction is called 
inelastic. 

In practice, particle reactions are achieved with the collision of beams of fast 
moving particles either with targets at rest in the laboratory or with other traveling 
beams. These beams of particles (the mother particles) are created in special man 
made machines (particle colliders, linear accelerators, synchrotrons etc.) or in 
interstellar space e.g. by magnetic fields of strongly magnetized stellar bodies. The 
particles, which emerge from a reaction (the daughter particles) are observed with 
special machines, which measure their distribution in space (number density and 
energy density in a specified direction and a small solid angle) in some LCF. For 
a long time the experimental measurements involved photographing and counting 
the particle trajectories (bubble chamber), but today there are more accurate and 
powerful electronic methods employing computers and photosensitive materials 
(scintillator etc.). In every case, what we observe in a particle reaction is a set 
of orbits, whose study provides us with information concerning the values of the 
physical quantities of the system in the LCF we are working. The experimentally 
measured quantities we observe are: 


1. The identity of the particles (partially) 

2. The distribution of the space velocities of the particles (i.e. the number of 
trajectories per given direction and unit solid angle) 

3. The distribution of energies and 3-momenta (curvatures and other geometric 
elements of the orbits in given electromagnetic fields) 


The “metamorphosis” of a physical system from one state to another can be 
studied in two stages. The first stage considers the initial and the final state and 
it is concerned only with the conservation of the physical quantities characterizing 
the system. The second stage studies the dynamics, that is the mechanism, which 
brings the initial state to the final state. In the following, we shall be concerned with 
the first study, sometimes called the kinetics of the reaction. The second approach 
requires the methods of Quantum Physics and it is outside the scope of this book. 


Fig. 10.1 Spacetime representation of a reaction


10.2 Representation of Particle Reactions 


The spacetime representation of a particle reaction is shown in Fig. 10.1. This 
representation simply shows the particles which react and the particles which are 
produced. The dark circle represents the mechanism of the reaction (the “dynamics” 
of the reaction) and, as we remarked above, concerns the study of the reaction by 
means of quantum field theory. 

The study of the kinetics of a particle reaction is also done in two stages, as
it is done in conventional chemistry. In the first stage, the reaction is studied
stoichiometrically, that is, we consider a small number of individual reacting
particles and write:

$$A + B + \cdots \rightarrow A' + B' + \cdots$$

This defines the initial and the final states of the system. Then one considers the
conservation equations of the physical quantities of the system in one or more
states and draws conclusions with the purpose to explain or foresee the experimental
data. In the second stage, the reaction is studied quantitatively, that is, one considers
distributions of particles (not individual particles) and subsequently studies the
conservation of the various physical quantities of the system. In the following, we
shall be concerned with the stoichiometric study only, because the quantitative study
requires special knowledge beyond the scope of this book.


10.3 Relativistic Reactions 


In Special Relativity the basic physical quantity, which characterizes a system of 
particles is the four-momentum. The four-momentum of a relativistic system in a 
certain state is the vector sum of the four-momenta of all components comprising 
the system. If the state of the system consists only of free particles, then the four- 
momentum of the system equals the sum of four-momenta of all individual particles. 
If the state of the system contains fields of interaction among the particles, then 
the four-momentum of the system contains the four-momenta of the particles plus 
the four-momenta of the fields of interaction. For example, the four-momentum 
of an atom includes the four-momenta of the nucleus, and the electrons plus the 




momentum of the electromagnetic field coupling the nucleus and the electrons. 
Furthermore, the nucleus itself is a system whose four-momentum is the sum of the 
momenta of the nucleons which make it up and the hadronic and the electromagnetic 
fields which couple them. Moreover, each nucleon is a relativistic system and so 
on. We note that there is an endless process, which does not allow for a clear-cut
distinction between particle momentum and field momentum. Therefore, what is
experimentally measured is the total momentum of an experimentally identifiable
system.

Consider a set of particles $P_1, P_2, \ldots, P_n$ with corresponding four-momenta
$p^i_1, p^i_2, \ldots, p^i_n$. We say that these particles react or collide if their world lines (in
spacetime!) have a common point as shown in Fig. 10.2.

The reaction is a common event for all particles involved. As a result of the
reaction, suppose that new particles $Q_1, Q_2, \ldots, Q_m$ are produced with four-momenta
$q^i_1, q^i_2, \ldots, q^i_m$ respectively. The new particles are in general different
both in identity ($P_{(A)} \neq Q_{(B)}$) and in number (see Fig. 10.3).

The world lines of the daughter particles have again a common spacetime event,
which is the same with the event of reaction/collision of the mother particles.
Therefore, at the spacetime event of the reaction we have the following sets of
timelike or null four-vectors:

$$p^i_1, p^i_2, \ldots, p^i_n \quad\text{and}\quad q^i_1, q^i_2, \ldots, q^i_m.$$

These vectors are elements of Minkowski space, therefore they must comply with
the geometry of that space. In Chap. 1 we stated Proposition 1.11.1, which concerns
the sum of particle four-vectors and the triangle inequality in Minkowski space. Let
us see the effect of each of these geometric results on the four-momenta.


Fig. 10.2 Schematic representation of the reacting (mother) momenta

Fig. 10.3 Schematic representation of the produced (daughter) momenta
10.3.1 The Sum of Particle Four-Vectors 


According to Proposition 1.11.1, the sum of a finite number of future/past directed 
timelike or null four-vectors (i.e. particle four-vectors) is a future/past directed 
timelike four-vector except, if and only if, all vectors are null and parallel, in 
which case the sum is a null vector parallel to the individual vectors. Let us see 
the significance of this result in the case of spacetime collisions. The second part 
assures that in Special Relativity the light beams exist and are geometrically sound 
quantities. Indeed, a light beam is defined by a bunch of parallel null straight 
worldlines and according to Proposition 1.11.1, the sum of such an aggregate of 
lines is again a parallel null line corresponding to the beam. Furthermore, it is easy 
to show that if two photons (or null straight worldlines) are parallel in Minkowski 
space, then their directions of propagation in Euclidean 3-space are also parallel. 
Therefore, a light beam in Newtonian space is also a light beam in spacetime. This 
is an important conclusion otherwise we would not be able to coordinate spacetime 
by means of light rays (i.e. to do chronometry in Special Relativity) working within 
our Newtonian environment. 

Concerning the first part of the Proposition we have that (excluding the case
that all four-momenta are future/past directed null and parallel) the sums $P^i =
\sum_{A=1}^{n} p^i_A$ and $Q^i = \sum_{B=1}^{m} q^i_B$ are timelike four-vectors. This geometric result,
when transferred to the relativistic reactions, means that they must be of the general
form shown in Fig. 10.4 below:

Certainly, geometry does not (and should not!) say which particles are produced 
when certain particles are involved in a relativistic reaction. This is the work of 
Physics and its laws. However, what it does say, is that in a relativistic reaction 
particles produce particles and only particles. This is not a self evident fact and 
shows that our belief of how a relativistic reaction occurs is justified by the 
mathematical model we employ. 

Because four-momentum is a relativistic physical quantity, we demand that 
during a relativistic reaction the total four-momentum is conserved, that is, the 
following equation holds: 


$$P^i = Q^i. \qquad (10.1)$$


The understanding of this law is different from the usual point of view of the 
corresponding law of conservation of 3-momentum (and energy) of Newtonian 
Physics, the difference being in the absolute character of the particles of the latter. 
Indeed, in Special Relativity it is assumed that during the reaction, the reacting
particles form a new unstable (or virtual) particle with four-momentum $P^i$ which
subsequently, with some kind of esoteric mechanism, breaks up into the products
of the reaction. This virtual particle we call center of momentum particle and
indicate as shown in Fig. 10.5. Such an assumption is clearly incompatible with the
Newtonian point of view.

Fig. 10.4 Relativistic reaction (particles enter the reaction and particles emerge)

The law of conservation of four-momentum is a pillar of contemporary physics
and appears to hold exactly without exception. For example, an apparent violation
in radioactive beta decay back in the 1930s led Pauli to the speculative hypothesis
that in that reaction there is also an undetected neutral particle of small proper mass,
which was later named “neutrino”. This speculation was established¹ twenty three
years later, beyond any doubt, by its detection in a nuclear reaction.

Geometrically the law of conservation of four-momentum in spacetime is
represented as in Newtonian 3-space, that is, with two closed polygons with one
common side (the four-vector $P^i$) the rest of the sides being the four-momenta
$p^i_1, p^i_2, \ldots, p^i_n$ and $q^i_1, q^i_2, \ldots, q^i_m$ as shown in Fig. 10.6.

Algebraically, this law is described by the equation:

$$p^i_1 + p^i_2 + \cdots + p^i_n = q^i_1 + q^i_2 + \cdots + q^i_m. \qquad (10.2)$$


Fig. 10.5 The center of momentum particle

Fig. 10.6 The closed polygon of linear four-momenta

¹ C.L. Cowan Jr and F. Reines, Phys. Rev. 92, 830 (1953); F. Reines and C.L. Cowan Jr, Phys. Rev.
113, 273 (1959).


10.3.2 The Relativistic Triangle Inequality 


Let us consider now the second geometric result of Proposition 1.11.1 concerning
the relativistic triangle inequality. Let $OA^i$, $OB^i$ and $AB^i$ be the position four-vectors
of three spacetime points O, A, B in the interior of the (future) light cone.
Then for the spacetime triangle (OAB) it is true that the Lorentz lengths (not the
Euclidean lengths!) satisfy the relation:

(Lorentz length of OB) ≥ (Lorentz length of OA) + (Lorentz length of AB).


This relation is opposite to the one of Euclidean geometry and can be extended 
easily to any spacetime polygon, whose sides are timelike and/or null four-vectors. 
The physical meaning of this geometric result is the following: 

Consider a closed spacetime polygon with sides the four-momentum vectors $p^i_A$
($A = 1, 2, \ldots, n$) and let $P^i$ be their sum. The length of each side equals $m_Ac$,
where $m_A$ is the mass of the particle $A$ with four-momentum $p^i_A$ ($A = 1, 2, \ldots, n$).
The geometric result implies that the mass of the center of momentum particle is greater
than (and in extreme cases equal to) the sum of the masses of the individual reacting
particles. Because the mass is directly related to energy, this result can be understood
as follows.

A relativistic system in every state consists of a set of particles and has total 
energy E, which includes the masses of the constituting particles, their kinetic 
energies and the energy of the dynamical fields among the particles: 


[Mass of the system] =[Masses of particles] + [Kinetic energy of particles] 


+ [Energy of dynamical fields]. 


At every moment, these three types of energy are in dynamic equilibrium, so that
it is possible, for example, to reduce the mass of the individual particles while increasing
the kinetic energy in order to keep the total mass of the system constant. This
phenomenon happens in nuclear reactors (e.g. in the sun), where a part of the mass
of the system is transformed into kinetic energy of the products which, in turn,
is dissipated as radiation and heat.
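The relativistic triangle inequality for four-momenta can be illustrated numerically. The sketch below assumes free particles (no interaction fields), units with $c = 1$, and hypothetical helper names; it compares the mass of the center of momentum particle with the sum of the individual masses.

```python
import math

def four_momentum(mass, p):
    """Four-momentum (E, px, py, pz) of a free particle with c = 1."""
    energy = math.sqrt(mass**2 + sum(x * x for x in p))
    return (energy, p[0], p[1], p[2])

def invariant_mass(momenta):
    """Mass of the center of momentum particle, M^2 = E^2 - P^2 (c = 1)."""
    E = sum(q[0] for q in momenta)
    P = [sum(q[i] for q in momenta) for i in (1, 2, 3)]
    return math.sqrt(max(E * E - sum(x * x for x in P), 0.0))

# Two particles of mass 1 colliding head-on: kinetic energy adds to the mass.
head_on = [four_momentum(1.0, (3.0, 0.0, 0.0)),
           four_momentum(1.0, (-3.0, 0.0, 0.0))]
# The same particles moving together: the extreme case of equality.
comoving = [four_momentum(1.0, (3.0, 0.0, 0.0)),
            four_momentum(1.0, (3.0, 0.0, 0.0))]
print(invariant_mass(head_on), invariant_mass(comoving))
```

In the head-on case the system mass exceeds the sum of the masses by the kinetic energy available in the center of momentum frame, while for particles at relative rest the two coincide.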


10.4 Working With Four-Momenta 


Having given the basic definitions and notions concerning the four-momentum we 
continue with practical examples, which show how one works in practice with four- 
momenta. 


Example 10.4.1 Show that there are no relativistic reactions, whose product will be 
a single photon. Also show that the photon does not decay. 




Solution

Assume that particles 1, 2 with masses $m_1$, $m_2$ respectively interact and produce
a photon according to the reaction $1 + 2 \rightarrow \gamma$.

Conservation of four-momentum gives:

$$p^i_1 + p^i_2 = p^i_\gamma.$$

Squaring and using the relation $p^i p_i = -m^2c^2$ we find:

$$-m_1^2c^2 - m_2^2c^2 + 2p^i_1 p_{2i} = 0. \qquad (10.3)$$

Relation (10.3) contains invariant quantities, therefore its value can be computed in
any LCF. We choose the proper system $\Sigma_1$ of particle 1, in which $\mathbf{p}_1 = 0$, and let
$p^i_2 = \begin{pmatrix} E'_2/c \\ \mathbf{p}'_2 \end{pmatrix}_{\Sigma_1}$. Then relation (10.3) gives:

$$(m_1^2 + m_2^2)c^2 + 2m_1E'_2 = 0.$$

In this relation all terms are $\geq 0$ therefore it vanishes only if each term vanishes.
The vanishing of the first term gives $m_1 = m_2 = 0$, which means that the two
reacting particles must be photons.

Conservation of energy and 3-momentum in $\Sigma$ gives:

$$E_1 + E_2 = E_3$$

$$\mathbf{p}_1 + \mathbf{p}_2 = \mathbf{p}_3.$$

Squaring the second and using the relation $|\mathbf{p}_i| = E_i/c$ ($i = 1, 2, 3$) for the reacting
photons, we find $E_1E_2(1 - \cos\theta) = 0$, where $\theta$ is the angle between $\mathbf{p}_1$ and $\mathbf{p}_2$.
For colliding photons $\theta \neq 0$ (for a head-on collision $\theta = 180°$), therefore one of the
energies $E_1$, $E_2$ must vanish, which is impossible. We conclude that a
single photon cannot be the result of a relativistic reaction.

Considering the reaction in the equivalent form:

$$\gamma \rightarrow (1) + (2)$$

we conclude that a photon cannot decay.
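The key step for two photons, that the squared sum of their four-momenta equals $2E_1E_2(1 - \cos\theta)$ in units with $c = 1$, can be checked with a short sketch. The helper names are illustrative and the photons are assumed to move in the x-y plane.

```python
import math

def photon(E, theta):
    """Null four-momentum (E, px, py, pz) of a photon of energy E (c = 1)
    moving in the x-y plane at angle theta to the x-axis."""
    return (E, E * math.cos(theta), E * math.sin(theta), 0.0)

def mass_squared(*momenta):
    """E^2 - P^2 of the total four-momentum; zero only for a null sum."""
    E = sum(q[0] for q in momenta)
    P = [sum(q[i] for q in momenta) for i in (1, 2, 3)]
    return E * E - sum(x * x for x in P)

head_on = mass_squared(photon(2.0, 0.0), photon(3.0, math.pi))
parallel = mass_squared(photon(2.0, 0.0), photon(3.0, 0.0))
print(head_on, parallel)
```

Only parallel photons give a null total four-momentum, and parallel photons do not collide, which is the content of the argument above.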


The following example shows the connection of the relativistic law of conser- 
vation of four-momentum with the conservation of energy and 3-momentum of 
Newtonian Physics. 


Example 10.4.2 


a. Show that if the time component of a four-vector vanishes in all LCF, then the 
four-vector is the zero four-vector. (This result is known as the Theorem of the 
zeroth component). 




b. Show that in the Theory of Special Relativity the conservation of energy implies 
the conservation of 3-momentum and conversely. This result shows that the 
Newtonian laws of conservation of energy and 3-momentum are contained as 
a set in the conservation law of four-momentum. 


Solution 
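a. Consider the four-vector $A^i$, which in the LCF $\Sigma$ and $\Sigma'$ has components $A^i =
\begin{pmatrix} 0 \\ \mathbf{A} \end{pmatrix}_\Sigma$ and $A^i = \begin{pmatrix} 0 \\ \mathbf{A}' \end{pmatrix}_{\Sigma'}$ respectively. If the relative velocity of $\Sigma$, $\Sigma'$ is $\boldsymbol{\beta}$,
then the Lorentz transformation gives:

$$0 = \gamma(0 - \boldsymbol{\beta}\cdot\mathbf{A})$$

$$\mathbf{A}'_\parallel = \gamma(\mathbf{A}_\parallel - \boldsymbol{\beta}\cdot 0), \qquad \mathbf{A}'_\perp = \mathbf{A}_\perp$$

from which follows:

$$\boldsymbol{\beta}\cdot\mathbf{A} = 0, \qquad \mathbf{A}'_\parallel = \gamma\mathbf{A}_\parallel, \qquad \mathbf{A}'_\perp = \mathbf{A}_\perp.$$

Because $\boldsymbol{\beta}$ is arbitrary, the first relation gives $\mathbf{A} = 0$ and then from the remaining
relations follows $\mathbf{A}' = 0$. In both cases we have $A^i = 0$.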

b. Let $P^i$ be the four-momentum of a physical system, which undergoes an
interaction. Let $\Delta P^i$ be the change in the four-momentum of the system during
this interaction and consider an arbitrary LCF $\Sigma$, in which the four-vector $\Delta P^i$ has
components $\Delta P^i = \begin{pmatrix} \Delta E/c \\ \Delta\mathbf{p} \end{pmatrix}_\Sigma$. If the energy is conserved during the interaction
then $\Delta E = 0$ in all LCF, hence from the Theorem of the zeroth component
follows $\Delta P^i = 0 \Rightarrow \Delta\mathbf{p} = 0$, that is, the 3-momentum of the physical system is
also conserved during the interaction. Working in a similar manner we show that
$\Delta\mathbf{p} = 0 \Rightarrow \Delta E = 0$.


10.5 Special Coordinate Frames in the Study of Relativistic 
Collisions 


The quantitative study of a relativistic reaction involves calculations with components,
therefore it requires the decomposition of the four-vectors involved in one
LCF. In practice, we use the following three LCF:


e The proper frame of the center of momentum particle. This frame we call the 
Center of Momentum (CM) frame and denote as * or ©”. 

e The proper frame of one of the reacting or the product particles (excluding of
course the photons). These LCF we call target frame of the relevant particle. 




e The Laboratory frame, which as a rule coincides with one of the target frames 
but in the case of colliding beams, it can be the proper frame of the measuring 
apparatus. The Laboratory frame we denote with (L). 


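It is useful to compute the $B$, $\Gamma$ factors of the CM frame in the L frame (or any other frame we desire). To do that we decompose the four-vector $P^i$ in both systems:

$$P^i = \begin{pmatrix} E^*/c \\ \mathbf{0} \end{pmatrix}_{\Sigma^*} = \begin{pmatrix} E^L/c \\ \mathbf{P}^L \end{pmatrix}_{L},$$

where $E^* = Mc^2$ and $M$ is the mass of the CM particle. From the relation $E^L = M\Gamma c^2$, or from the time component of the Lorentz transformation, we have:

$$\Gamma = \frac{E^L}{Mc^2}. \tag{10.4}$$

The space component of the Lorentz transformation gives:

$$\mathbf{B} = \frac{c\,\mathbf{P}^L}{E^L}. \tag{10.5}$$

In words the above relations say [see also (9.10)]:

$$\mathbf{B} = \frac{c \times \text{Total 3-momentum of the system of particles in L}}{\text{Total energy of system of particles in L}} \tag{10.6}$$

$$\Gamma = \frac{\text{Total energy of system of particles in L}}{c^2 \times \text{Mass of CM Particle}}. \tag{10.7}$$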


In order to compute the quantities $B$, $\Gamma$ in terms of the energies and the 3-momenta of the reacting particles, we decompose in the lab frame (or in any other frame we desire) the four-momenta of the particles $p_A^i = \begin{pmatrix} E_A/c \\ \mathbf{p}_A \end{pmatrix}_{L}$, $A = 1, 2, \ldots, n$ and have the relations:

$$E^L = \sum_A E_A, \qquad \mathbf{P}^L = \sum_A \mathbf{p}_A. \tag{10.8}$$

We recall that the four-momentum of a photon is $p^i = \frac{E}{c}\begin{pmatrix} 1 \\ \hat{\mathbf{e}} \end{pmatrix}_{\Sigma}$, where $\hat{\mathbf{e}}$ is the direction of propagation of the photon in the LCF $\Sigma$ and $E$ is the energy of the photon in $\Sigma$.
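Relations (10.4), (10.5) and (10.8) can be sketched numerically as follows. This is only an illustrative sketch: the function name `cm_factors`, the choice of units with $c = 1$, and the sample values (a pion of momentum 2.50 GeV/c hitting a proton at rest, the configuration of Example 10.6.2) are assumptions made for the illustration.

```python
import math

def cm_factors(particles, c=1.0):
    """Return (B, Gamma, M) of the CM particle in the frame in which the
    particles are given.

    particles: list of (E, (px, py, pz)) pairs, one per particle.
    """
    # Relation (10.8): total energy and total 3-momentum of the system.
    E_tot = sum(E for E, _ in particles)
    P_tot = [sum(p[i] for _, p in particles) for i in range(3)]
    P_len = math.sqrt(sum(x * x for x in P_tot))

    B = c * P_len / E_tot                              # relation (10.5)
    M = math.sqrt(E_tot**2 - (c * P_len)**2) / c**2    # mass of the CM particle
    Gamma = E_tot / (M * c**2)                         # relation (10.4)
    return B, Gamma, M

# Hypothetical usage in units GeV, GeV/c, GeV/c^2 with c = 1:
m_pi, m_p, p_pi = 0.140, 0.938, 2.50
particles = [
    (math.hypot(p_pi, m_pi), (0.0, 0.0, p_pi)),   # moving pion
    (m_p, (0.0, 0.0, 0.0)),                       # proton at rest
]
B, Gamma, M = cm_factors(particles)
print(B, Gamma, M)
```

The mass $M$ is obtained here from the invariant length of the total four-momentum, so that $\Gamma$ computed from (10.4) agrees with $1/\sqrt{1 - B^2}$.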


10.6 The Generic Reaction A + B → C


Let us consider the reaction

$$1 + 2 + \cdots \to 1' + 2' + \cdots.$$


According to our previous considerations this reaction expresses two different 
appearances of the same physical system. Furthermore, the particles themselves can 
be complex physical systems consisting of more particles. This point of view allows 
us to consider the reaction 


$$A + B \to C$$


where A, B are hypothetical (virtual) “particles” which consist of some particles 1, 2, ... and C is the particle 1' + 2' + .... With this approach we can think of any relativistic reaction as a sequence of reactions of the form A + B → C. We remark that the reaction (decay) A → B + C is the same as A + B → C because it can be written as A + (−B) → C. We conclude that the reaction:


$$A + B \to C$$


is the generic reaction in terms of which all reactions can be studied. In the 
following, we study the generic reaction and apply the results to specific problems. 

In order to facilitate the writing of the results and show their cyclical property, we introduce the following notation. Consider a physical quantity $A$ which refers to the particle $N$ in the LCF $\Sigma$. We shall write:

$${}^{N}_{\Sigma}A.$$

For example, the energy of particle 1 in the proper frame $\Sigma_2$ of particle 2 is written as ${}^{1}_{2}E$. If the LCF $\Sigma$ is obvious it shall be omitted. For the CM frame $\Sigma^*$ we shall write ${}^{1}_{\Sigma^*}E$ or ${}^{1}E^*$.


10.6.1 The Physics of the Generic Reaction 


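Consider an arbitrary LCF $\Sigma$, in which the four-momenta of particles A, B, C are decomposed as follows:

$$\begin{pmatrix} {}^{A}_{\Sigma}E/c \\ {}^{A}_{\Sigma}\mathbf{P} \end{pmatrix}_{\Sigma}, \qquad \begin{pmatrix} {}^{B}_{\Sigma}E/c \\ {}^{B}_{\Sigma}\mathbf{P} \end{pmatrix}_{\Sigma}, \qquad \begin{pmatrix} {}^{C}_{\Sigma}E/c \\ {}^{C}_{\Sigma}\mathbf{P} \end{pmatrix}_{\Sigma}.$$

Conservation of four-momentum gives:

$$\begin{pmatrix} {}^{A}_{\Sigma}E/c \\ {}^{A}_{\Sigma}\mathbf{P} \end{pmatrix}_{\Sigma} + \begin{pmatrix} {}^{B}_{\Sigma}E/c \\ {}^{B}_{\Sigma}\mathbf{P} \end{pmatrix}_{\Sigma} = \begin{pmatrix} {}^{C}_{\Sigma}E/c \\ {}^{C}_{\Sigma}\mathbf{P} \end{pmatrix}_{\Sigma} \tag{10.9}$$

or in terms of components:

$${}^{A}_{\Sigma}E + {}^{B}_{\Sigma}E = {}^{C}_{\Sigma}E \tag{10.10}$$

$${}^{A}_{\Sigma}\mathbf{P} + {}^{B}_{\Sigma}\mathbf{P} = {}^{C}_{\Sigma}\mathbf{P}. \tag{10.11}$$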


To these equations we must add the corresponding equations, which define the 
masses: 


$${}^{I}_{\Sigma}E = \sqrt{{}^{I}_{\Sigma}\mathbf{P}^2c^2 + m_I^2c^4}, \qquad I = A, B, C. \tag{10.12}$$


We end up with a system of seven equations in 15 quantities. Usually, the masses 
are assumed to be given, therefore the eight unknown quantities reduce to five. As 
we show below, the masses suffice in order to fix the energies and the lengths of 
3-momenta, therefore the remaining five unknowns involve the directions of the 3- 
momenta. 

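In order to compute the energies in terms of the masses we square (10.9) and evaluate the Lorentz product of the four-momenta in the proper frame of one of the involved particles, as we did in Example 10.4.1. We have:

$$-m_A^2c^2 - m_B^2c^2 + 2\left(-\frac{{}^{A}_{\Sigma}E\;{}^{B}_{\Sigma}E}{c^2} + {}^{A}_{\Sigma}\mathbf{P}\cdot{}^{B}_{\Sigma}\mathbf{P}\right) = -m_C^2c^2. \tag{10.13}$$

In the proper frame $\Sigma_A$ of particle A we have ${}^{A}_{A}\mathbf{P} = 0$, ${}^{A}_{A}E = m_Ac^2$ and let $p_B^i = \begin{pmatrix} {}^{B}_{A}E/c \\ {}^{B}_{A}\mathbf{P} \end{pmatrix}_{\Sigma_A}$. Then (10.13) gives:

$${}^{B}_{A}E = \frac{m_C^2 - m_A^2 - m_B^2}{2m_A}c^2. \tag{10.14}$$

In order to compute the energy ${}^{A}_{B}E$, we simply interchange A, B in (10.14) and find:

$${}^{A}_{B}E = \frac{m_C^2 - m_A^2 - m_B^2}{2m_B}c^2. \tag{10.15}$$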


Finally, in order to compute the energy ${}^{C}_{A}E$, we write the reaction as:

$$A + (-C) \to (-B)$$

and apply (10.14) with the following change of letters:

$$A \leftrightarrow A, \qquad B \leftrightarrow -C, \qquad C \leftrightarrow -B$$

namely,

$$p_A \leftrightarrow p_A, \qquad p_B \leftrightarrow -p_C, \qquad p_C \leftrightarrow -p_B.$$

We write the result leaving the squares of the four-momenta (the masses) the same and changing the sign of the energy:

$${}^{C}_{A}E = \frac{-m_B^2 + m_A^2 + m_C^2}{2m_A}c^2. \tag{10.16}$$


We note that the relations giving the energies are cyclical and independent of 
the nature of the particles, therefore there must exist a purely geometric method for 
the calculation of the energies. This observation leads us to look for a geometric 
description of the reaction as a collision of four-vectors, a point of view which we 
shall develop in Sect. 17.3. 

Concerning the computation of the length of the 3-momenta, we have:

$$|{}^{B}_{A}\mathbf{P}|^2c^2 = {}^{B}_{A}E^2 - m_B^2c^4 = \left(\frac{m_C^2 - m_A^2 - m_B^2}{2m_A}\right)^2c^4 - m_B^2c^4 = \frac{c^4}{4m_A^2}(m_A + m_B + m_C)(m_A + m_B - m_C)(m_A - m_B + m_C)(m_A - m_B - m_C).$$

It follows:

$$|{}^{B}_{A}\mathbf{P}| = \frac{c}{2m_A}\lambda(m_A, m_B, m_C) \tag{10.17}$$

where the function $\lambda$ is defined by the relation:

$$\lambda(m_A, m_B, m_C) = \sqrt{(m_A + m_B + m_C)(m_A + m_B - m_C)(m_A - m_B + m_C)(m_A - m_B - m_C)} \ge 0 \tag{10.18}$$

and has dimensions $[M^2L^0T^0]$. The function $\lambda(m_A, m_B, m_C)$ is symmetric in all its arguments, therefore the only difference between the lengths of the 3-momenta of the particles is the mass in the denominator of relation (10.17). The function $\lambda$ is characteristic in the study of the relativistic reactions/collisions and it is called the triangle function.
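A short numerical sketch can make relations (10.14), (10.16), (10.17) and (10.18) concrete. The function names and the sample masses below are hypothetical and chosen only for illustration; units with $c = 1$ are assumed by default.

```python
import math

def triangle(mA, mB, mC):
    """Triangle function of (10.18), written as a single square root.
    Requires mC >= mA + mB (or one of the symmetric conditions)."""
    prod = ((mA + mB + mC) * (mA + mB - mC)
            * (mA - mB + mC) * (mA - mB - mC))
    return math.sqrt(prod)

def energy_B_in_A(mA, mB, mC, c=1.0):
    """Energy of B in the proper frame of A, relation (10.14)."""
    return (mC**2 - mA**2 - mB**2) / (2 * mA) * c**2

def energy_C_in_A(mA, mB, mC, c=1.0):
    """Energy of C in the proper frame of A, relation (10.16)."""
    return (mA**2 - mB**2 + mC**2) / (2 * mA) * c**2

def momentum_in_A(mA, mB, mC, c=1.0):
    """Common length of the 3-momenta of B and C in the proper frame of A,
    relation (10.17)."""
    return c * triangle(mA, mB, mC) / (2 * mA)

# Hypothetical masses (arbitrary units) for the reaction A + B -> C.
mA, mB, mC = 1.0, 2.0, 4.0
E_B = energy_B_in_A(mA, mB, mC)
E_C = energy_C_in_A(mA, mB, mC)
P = momentum_in_A(mA, mB, mC)
print(E_B, E_C, P)
```

With these values one can check that energy is conserved in the proper frame of A ($m_A c^2 + E_B = E_C$) and that both B and C satisfy the mass-shell relation (10.12) with the same momentum length.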


Exercise 10.6.1 In general, the $\lambda$ function for three variables is defined as follows:

$$\lambda(x^2, y^2, z^2) = \sqrt{(x^2 + y^2 - z^2)^2 - 4x^2y^2}. \tag{10.19}$$

Show that the function $\lambda(x^2, y^2, z^2)$ has the following algebraic and geometric properties.


A. Algebraic
1.

$$\lambda(x^2, y^2, z^2) = \lambda(z^2, y^2, x^2) = \lambda(y^2, z^2, x^2) \tag{10.20}$$

$$\lambda(x^2, y^2, z^2) = \sqrt{x^4 + y^4 + z^4 - 2x^2y^2 - 2y^2z^2 - 2z^2x^2} \tag{10.21}$$

$$\lambda(x^2, y^2, z^2) = \sqrt{\left[x^2 - (y + z)^2\right]\left[x^2 - (y - z)^2\right]} \tag{10.22}$$

$$\lambda(x^2, y^2, y^2) = \sqrt{x^2(x^2 - 4y^2)} \tag{10.23}$$

$$\lambda(x^2, y^2, 0) = |x^2 - y^2| \tag{10.24}$$

2. The quantity under the square root satisfies $\lambda^2(x^2, y^2, z^2) > 0$ if $z > x + y$ (relativistic case) and $\lambda^2(x^2, y^2, z^2) < 0$ if $z < x + y$ (Newtonian case).


B. Geometric
The quantity $\frac{1}{4}\sqrt{-\lambda^2(x^2, y^2, z^2)}$ equals the area of a Euclidean triangle with sides $x, y, z$. The negative sign is needed because in the Euclidean case $z < x + y$ and $\lambda^2(x^2, y^2, z^2) < 0$. It is this property that gave $\lambda(x^2, y^2, z^2)$ its name, because $\frac{1}{4}\sqrt{-\lambda^2(x^2, y^2, z^2)}$ is the formula of Heron² for the area of a triangle in terms of its sides (see Example 17.5.1).
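The geometric property can be checked on a concrete triangle. The sketch below works with the expression under the square root in (10.19), here called `lambda_sq` (a hypothetical name), so that no square root of a negative number is taken in the Euclidean case; the 3-4-5 triangle is used as the test case.

```python
import math

def lambda_sq(x, y, z):
    """Expression under the square root in (10.19):
    (x^2 + y^2 - z^2)^2 - 4 x^2 y^2."""
    return (x * x + y * y - z * z) ** 2 - 4 * x * x * y * y

def lambda_sq_expanded(x, y, z):
    """Same quantity in the symmetric expanded form of (10.21)."""
    return (x**4 + y**4 + z**4
            - 2 * x**2 * y**2 - 2 * y**2 * z**2 - 2 * z**2 * x**2)

def heron_area(x, y, z):
    """Area of a Euclidean triangle with sides x, y, z (Heron's formula)."""
    s = (x + y + z) / 2
    return math.sqrt(s * (s - x) * (s - y) * (s - z))

# Right triangle with sides 3, 4, 5 (area 6).
area_from_lambda = 0.25 * math.sqrt(-lambda_sq(3, 4, 5))
print(area_from_lambda, heron_area(3, 4, 5))
```

For a Euclidean triangle `lambda_sq` is negative, which is why the minus sign appears under the root in the area formula.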


One could put forward the question: How much information is contained in the above equations? To answer this question we consider an arbitrary LCF $\Sigma$, which is not the proper frame of one of the particles A, B, C. Then relation (10.13) allows us to compute the (Euclidean) inner products of the 3-momenta of the particles in $\Sigma$ (note that these products are not Lorentz invariant, therefore they depend on the LCF $\Sigma$!). The inner products give the (Euclidean) angles among the 3-momenta in $\Sigma$, hence the triangle of the 3-momenta is fully determined in $\Sigma$. In case $\Sigma$ coincides with the proper frame of any of the particles A, B, C, the triangle of the 3-momenta degenerates into a straight line segment. We conclude that, assuming that the masses of the particles are given, we have the following information in an arbitrary LCF $\Sigma$:


1. Complete knowledge of the triangle (or straight line segments) of the 3-momenta.
2. Complete knowledge of the energies of the particles.


2 (See http://mathpages.com/home/kmath196.htm) 


Fig. 10.7 Representation of the reaction A + B → C


Therefore the remaining five parameters concern the positioning of the triangle of the 3-momenta in the 3-space of $\Sigma$ (more precisely in the momentum space of $\Sigma$). This positioning requires three parameters for fixing one of the vertices of the triangle and two parameters (two angles) for the determination of the orientation of the plane of the triangle.

A different way to study the effect of the masses on the “internal” structure of the system of particles is via the geometric representation of the relation $E^2 = |\mathbf{p}|^2c^2 + m^2c^4$. In the proper space of one of the particles it is easy to check the validity of the triangles of Fig. 10.7.

We note that the masses $m_A$, $m_B$ fix the point Z of the projection of the vertex K of the Euclidean triangles (KZH) and (KZD), the height in each case being fixed by the function $\lambda(m_A, m_B, m_C)$ and the masses $m_A$, $m_B$. Consequently, the three masses fix the figure completely and what remains is its positioning in the three dimensional Euclidean space of the particle C.

The following example shows the application of the above discussion in practice. 


Example 10.6.1 Show that in the decay $1 \to 2 + 3$ the factors $\beta_2^*$, $\beta_3^*$ in the CM are constant, depending only on the masses of the particles and the energy in the CM.
Solution

Relation (10.17) gives the length of the 3-momenta of particles 2, 3:

$$|\mathbf{p}_2^*| = |\mathbf{p}_3^*| = \frac{\lambda(M, m_2, m_3)}{2M}c$$

where $M$ is the mass of the CM particle (i.e. particle 1). The energies of the particles 2, 3 in the CM frame, according to equation (10.16), are:

$$E_2^* = \frac{M^2 + m_2^2 - m_3^2}{2M}c^2, \qquad E_3^* = \frac{M^2 - m_2^2 + m_3^2}{2M}c^2.$$

From equation (10.6) the factor $\beta_2^*$ of particle 2 in the CM frame is:

$$\beta_2^* = \frac{|\mathbf{p}_2^*|c}{E_2^*} = \frac{\lambda(M, m_2, m_3)}{M^2 + m_2^2 - m_3^2}. \tag{10.25}$$

We note that $\beta_2^*$ depends only on the masses of the particles and the mass of the CM particle, which in the CM frame equals the energy of the particles in that frame. We work similarly for particle 3.


10.6.2 Threshold of a Reaction 


An interesting limiting case occurs when the triangle of the 3-momenta degenerates to a straight line segment. This happens when the height of the triangle vanishes, that is, when the function $\lambda(m_A, m_B, m_C) = 0$. In this case, all three particles A, B, C are at rest in the proper frame of C (which is also the CM frame) and we say that this limiting case is the threshold of the reaction. Conservation of energy in the proper frame of C gives:

$${}^{A}_{C}E + {}^{B}_{C}E = m_Cc^2 \Rightarrow m_C \ge m_A + m_B$$

where equality holds only at the threshold of the reaction.³ This result is the physical explanation of the triangle inequality we mentioned in Sect. 10.3 or, equivalently, the geometric interpretation of the function $\lambda$ as the area of a Euclidean triangle with sides $m_A$, $m_B$, $m_C$. In practice, this means that when two particles collide totally inelastically, so that after the collision they become one particle, their kinetic energy (in the proper frame of the one daughter particle) transforms into mass of the new particle. Conversely, when a particle decays, a part of its mass becomes kinetic energy of the produced particles. At the threshold of a reaction, there is no transformation of mass into kinetic energy.


Exercise 10.6.2 Show that the necessary and sufficient condition for the particles A and B of non-zero mass to react at the threshold is:

$$\frac{p_A^i}{m_A} = \frac{p_B^i}{m_B} = u^i$$

where $u^i$ is the four-velocity of the particles.


³ The vanishing of the function $\lambda(m_A^2, m_B^2, m_C^2)$ implies the condition $(m_C^2 + m_A^2 - m_B^2)^2 - 4m_C^2m_A^2 = 0$, from which follows $m_C = m_B \pm m_A$. But the function $\lambda$ is symmetric in all its arguments, therefore the result must not change if we interchange the masses $m_A$, $m_B$. This property selects only the case $m_C = m_B + m_A$.


Exercise 10.6.3 Consider the case $m_A = m_B = m$ and show that in this case the triangle of the reaction is isosceles with height (= length of 3-momentum) $|\mathbf{p}| = \frac{c}{2}\sqrt{m_C^2 - 4m^2}$ and common side (= energy) $E = \frac{1}{2}m_Cc^2$. Deduce that at the threshold of the reaction the mass $m = \frac{m_C}{2}$ and the triangle degenerates to a straight line segment.


In case one of the particles A, B is a photon, only one of the triangles 
(K ZH), (KZD) of Fig. 10.7 survives. We infer that it is not possible that both 
particles are photons, because in that case there does not exist a triangle (see also 
Example 10.4.1). 


Exercise 10.6.4 Assume that one of the particles, the A say, is a photon and that $m_B \neq 0$. Show that in this case, the energies of A, B in the proper frame of C are the following:

$${}^{A}_{C}E = \frac{m_C^2 - m_B^2}{2m_C}c^2, \qquad {}^{B}_{C}E = \frac{m_C^2 + m_B^2}{2m_C}c^2. \tag{10.26}$$


What can you say about the lengths of the 3-momenta without any further 
calculations? 

Consider the limit $m_B \to 0$ and show that the mass cannot take the value
zero, because then the sum of the angles of the triangle of reaction becomes greater 
than 21. Conclude that it is not possible one particle to decay to two photons or 
two photons to react and produce a single particle (the decay in two particles is 
possible!). — 


The threshold energy of a reaction is the energy of the bullet particle in the CM 
frame at the threshold of the reaction. The threshold energy is different for different 
bullet particles. Let us see how the threshold energy is computed in practice. 


Example 10.6.2 Consider the reaction $\pi^- + p \to K^0 + \Lambda^0$, where the proton rests in the laboratory.


a. Compute the threshold energy of the pion in the laboratory.

b. In an experiment in which the pions have 3-momentum 2.50 GeV/c, it is observed that the $\Lambda^0$ particles have momentum 0.60 GeV/c and emerge at an angle of 45° with the direction of motion of the pions. Compute the $\gamma^*$ factor of the CM frame.

c. Compute the 3-momentum of the kaons $K^0$ in the laboratory frame and in the CM frame.


It is given that: $m_{\pi^-} = 140$ MeV/c², $m_p = 938$ MeV/c², $m_{K^0} = 498$ MeV/c², $m_{\Lambda^0} = 1116$ MeV/c².


Solution
a. The reaction is:

$$\underset{1}{\pi^-} + \underset{2}{p} \to \underset{3}{K^0} + \underset{4}{\Lambda^0}.$$

Conservation of four-momentum gives:

$$p_1^i + p_2^i = p_3^i + p_4^i.$$

At the threshold of the reaction, in the CM frame the total 3-momentum of the daughter particles vanishes. Therefore:

$$p_3^i + p_4^i = \begin{pmatrix} (m_3 + m_4)c \\ \mathbf{0} \end{pmatrix}_{CM}.$$

Squaring the conservation equation we find:

$$-m_1^2c^2 - m_2^2c^2 + 2p_1^ip_{2i} = -(m_3 + m_4)^2c^2. \tag{10.27}$$

The term $p_1^ip_{2i}$ is invariant, therefore it can be computed in any LCF. We choose the lab frame, where $p$ is at rest, and assume that at the threshold we have the components:

$$p_1^i = \begin{pmatrix} E_{1,th}/c \\ \mathbf{p}_{1,th} \end{pmatrix}_{L}, \qquad p_2^i = \begin{pmatrix} m_2c \\ \mathbf{0} \end{pmatrix}_{L}.$$

Replacing in (10.27) and solving for $E_{1,th}$, we find the threshold energy of the pion in the laboratory:

$$E_{1,th} = \frac{(m_3 + m_4)^2 - m_1^2 - m_2^2}{2m_2}c^2.$$

A second solution, which is based on the previous considerations, is the following. The energy of particle 1 in the lab frame is (see (10.14)):

$${}^{1}_{2}E = \frac{M^2 - m_1^2 - m_2^2}{2m_2}c^2$$

where $M$ is the mass of the CM particle. At the threshold of the reaction:

$$M = m_3 + m_4$$

from which follows:

$${}^{1}_{2}E_{th} = \frac{(m_3 + m_4)^2 - m_1^2 - m_2^2}{2m_2}c^2.$$


b. In the experiment, the four-momentum of the CM particle in the lab frame is:

$$P_{CM}^i = p_1^i + p_2^i = \begin{pmatrix} E/c \\ \mathbf{P} \end{pmatrix}_{L}$$

where:

$$E = E_1 + E_2 = \sqrt{p_1^2c^2 + m_1^2c^4} + m_2c^2 = 3.44\ \text{GeV}$$

$$\mathbf{P} = \mathbf{p}_1 + \mathbf{p}_2 = 2.50\,\mathbf{i}\ \text{GeV/c}.$$

The $\beta^*$ factor of the CM frame in the lab frame is given by:

$$\beta^* = \frac{|\mathbf{P}|c}{E} = 0.727$$

from which we compute $\gamma^* = 1.456$.

c. Conservation of 3-momentum in the lab frame gives:

$$\mathbf{p}_3 = \mathbf{p}_1 + \mathbf{p}_2 - \mathbf{p}_4 = 2.50\,\mathbf{i} - 0.60(\cos 45°\,\mathbf{i} + \sin 45°\,\mathbf{j}) = (2.08\,\mathbf{i} - 0.42\,\mathbf{j})\ \text{GeV/c}$$

from which we compute $|\mathbf{p}_3| = 2.12$ GeV/c and

$$\tan\theta_3 = \frac{-0.42}{2.08} = -0.2019 \Rightarrow \theta_3 = -11°\,24'\,57''.$$

The energy of $K^0$ in the lab is:

$$E_3 = \sqrt{p_3^2c^2 + m_3^2c^4} \approx 2.18\ \text{GeV}.$$

In order to compute the 3-momentum of the kaons in the CM frame we use the Lorentz transformation, which connects the lab and the CM frames. We have:

$$p_{3x}^* = \gamma^*(p_{3x} - \beta^*E_3/c)$$

$$p_{3y}^* = p_{3y}$$

$$E_3^*/c = \gamma^*(E_3/c - \beta^*p_{3x}).$$

Replacing we compute:

$$\mathbf{p}_3^* = (0.721\,\mathbf{i} - 0.42\,\mathbf{j})\ \text{GeV/c}.$$

It follows that the length $|\mathbf{p}_3^*| = 0.834$ GeV/c and the angle:

$$\tan\theta_3^* = \frac{-0.42}{0.721} = -0.5825 \Rightarrow \theta_3^* = -30°\,13'\,19''.$$
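The arithmetic of Example 10.6.2 can be reproduced with a few lines of code. The sketch below uses the masses and momenta given in the example, in units of GeV with $c = 1$; the variable names are illustrative only. Because the text rounds intermediate results, the values obtained this way may differ from the quoted ones in the last digit.

```python
import math

# Masses in GeV/c^2 as given in Example 10.6.2 (units with c = 1).
m_pi, m_p, m_K, m_L = 0.140, 0.938, 0.498, 1.116

# a. Threshold energy of the pion in the lab frame (proton at rest).
E_th = ((m_K + m_L)**2 - m_pi**2 - m_p**2) / (2 * m_p)

# b. Factors of the CM frame for pions of momentum 2.50 GeV/c.
p1 = 2.50
E_tot = math.hypot(p1, m_pi) + m_p      # total energy in the lab
beta_cm = p1 / E_tot                    # |P| c / E
gamma_cm = 1 / math.sqrt(1 - beta_cm**2)

# c. Kaon momentum in the lab from 3-momentum conservation.
p4 = 0.60                               # momentum of the Lambda
angle = math.radians(45)
p3x = p1 - p4 * math.cos(angle)
p3y = -p4 * math.sin(angle)
p3 = math.hypot(p3x, p3y)
E3 = math.hypot(p3, m_K)

# Boost of the kaon momentum along x to the CM frame.
p3x_cm = gamma_cm * (p3x - beta_cm * E3)
p3y_cm = p3y
p3_cm = math.hypot(p3x_cm, p3y_cm)

print(E_th, beta_cm, gamma_cm, p3, p3_cm)
```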
Example 10.6.3 A particle A of mass $m_1 \neq 0$, which in the lab frame has energy $E_1$, decays in two particles B, C with masses $m_2$, $m_3$ respectively.


a. Show that $m_1 \ge m_2 + m_3$, where the equal sign holds at the threshold of the reaction.

b. Calculate the range of energies of particle A (in the lab) and the conditions that the masses must satisfy in order for the reaction to be possible.


Solution


a. The reaction is:

$$\underset{1}{A} \to \underset{2}{B} + \underset{3}{C}.$$

Conservation of four-momentum $p_1^i = p_2^i + p_3^i$ in the proper frame $\Sigma_A$ of A gives:

$$m_1\begin{pmatrix} c \\ \mathbf{0} \end{pmatrix}_{\Sigma_A} = m_2\gamma_{21}\begin{pmatrix} c \\ \mathbf{v}_{21} \end{pmatrix}_{\Sigma_A} + m_3\gamma_{31}\begin{pmatrix} c \\ \mathbf{v}_{31} \end{pmatrix}_{\Sigma_A}$$

where $\gamma_{21}$, $\gamma_{31}$ are the $\gamma$-factors of particles 2, 3 in $\Sigma_A$. From the zeroth component we have:

$$m_1 = m_2\gamma_{21} + m_3\gamma_{31} \ge m_2 + m_3.$$

The “=” holds when $\gamma_{21} = \gamma_{31} = 1$, that is at the threshold of the reaction.
b. Square the conservation equation to get:

$$-m_2^2c^2 = -m_1^2c^2 - m_3^2c^2 - 2p_1^ip_{3i}. \tag{10.28}$$

We compute the product $p_1^ip_{3i}$ in the lab frame. We have:

$$p_1^ip_{3i} = \begin{pmatrix} E_1/c \\ \mathbf{p}_1 \end{pmatrix}_{L}\cdot\begin{pmatrix} E_3/c \\ \mathbf{p}_3 \end{pmatrix}_{L} = -\frac{E_1E_3}{c^2} + p_1p_3\cos\theta$$

where $p_1 = |\mathbf{p}_1|$, $p_3 = |\mathbf{p}_3|$ and $\theta$ is the angle between $\mathbf{p}_1$, $\mathbf{p}_3$ in the lab frame. Replacing in equation (10.28) we find:

$$E_1E_3 - c^2p_1p_3\cos\theta = \frac{1}{2}(m_1^2 + m_3^2 - m_2^2)c^4 \Rightarrow$$

$$cp_1\cos\theta\sqrt{E_3^2 - m_3^2c^4} = -Xc^2 + E_1E_3 \tag{10.29}$$

where $X = \frac{1}{2}(m_1^2 + m_3^2 - m_2^2)c^2 > 0$. Squaring we find:

$$c^2p_1^2\cos^2\theta\,(E_3^2 - m_3^2c^4) = X^2c^4 + E_1^2E_3^2 - 2Xc^2E_1E_3 \Rightarrow$$

$$(E_1^2 - c^2p_1^2\cos^2\theta)E_3^2 - 2Xc^2E_1E_3 + X^2c^4 + c^6p_1^2m_3^2\cos^2\theta = 0.$$

This is a quadratic equation in $E_3$, therefore $E_3$ attains its extreme values when the discriminant $\Delta = 0$. We compute:

$$\Delta = (Xc^2E_1)^2 - (E_1^2 - c^2p_1^2\cos^2\theta)(X^2c^4 + c^6p_1^2m_3^2\cos^2\theta)$$

$$= X^2c^4E_1^2 - \left[E_1^2X^2c^4 + (c^6E_1^2p_1^2m_3^2 - c^6p_1^2X^2)\cos^2\theta - c^8p_1^4m_3^2\cos^4\theta\right]$$

and find the equation:

$$\left[c^2p_1^2m_3^2\cos^2\theta - (E_1^2m_3^2 - X^2)\right]\cos^2\theta = 0.$$

The roots of this equation are:

$$\cos\theta = 0, \qquad \cos^2\theta = \frac{E_1^2m_3^2 - X^2}{m_3^2p_1^2c^2}.$$

We examine each root separately.

When the first root $\cos\theta = 0$ is replaced in the original equation (10.29) it gives $E_1E_3 = Xc^2$, hence $E_1E_3$ is an invariant.

The second root gives:

$$\cos^2\theta = \frac{E_1^2}{p_1^2c^2} - \frac{X^2}{m_3^2p_1^2c^2} = 1 + \frac{m_1^2c^2}{p_1^2} - \frac{X^2}{m_3^2p_1^2c^2} \ge 0,$$

$$\sin^2\theta = \frac{X^2}{m_3^2p_1^2c^2} - \frac{m_1^2c^2}{p_1^2} \ge 0.$$

Therefore, we have the condition:

$$0 \le \frac{X^2}{m_3^2p_1^2c^2} - \frac{m_1^2c^2}{p_1^2} \le 1 \Rightarrow 0 \le \frac{X^2}{m_3^2} - m_1^2c^4 \le p_1^2c^2 \Rightarrow$$

$$0 \le \frac{X^2}{m_3^2} - m_1^2c^4 \le E_1^2 - m_1^2c^4.$$

The left-hand side gives the following condition for the energy:

$$X^2 - m_1^2m_3^2c^4 \ge 0 \Rightarrow$$

$$(X - m_1m_3c^2)(X + m_1m_3c^2) \ge 0 \Rightarrow$$

$$\left((m_1 - m_3)^2 - m_2^2\right)\left((m_1 + m_3)^2 - m_2^2\right) \ge 0.$$

Because $m_1 > m_2$ and $m_1 > m_3$, this inequality has the solution:

$$m_1 \ge m_2$$

(as well as the condition $m_1 \ge m_2 + m_3$, which we already know). The other condition for the energy of the mother particle is:

$$E_1^2 \ge \frac{X^2}{m_3^2} \Rightarrow E_1 \ge \frac{|X|}{m_3} = \frac{m_1^2 + m_3^2 - m_2^2}{2m_3}c^2.$$

Because in the proper frame of the mother particle the daughter particles in general have energies $E_2 \ge m_2c^2$, $E_3 \ge m_3c^2$, it must be true that $m_1 \ge m_2 + m_3$, which we already know.

(Examine the case that one of the daughter particles is a photon).


Example 10.6.4 In the generic reaction A + B → C define the invariants:

$$s_{AB} = \frac{1}{c^2}(p_A^i + p_B^i)^2, \qquad t_{AB} = \frac{1}{c^2}(p_A^i - p_B^i)^2. \tag{10.30}$$

The first of these invariants gives the mass of the CM particle and the second the 3-momentum transfer (equivalently the percentage of energy of the system, which is transformed into kinetic energy of the CM particle). Show that the quantities $s_{AB}$, $t_{AB}$ attain their extreme values simultaneously and that this happens at the threshold of the reaction. Also show that in that case the mass of the CM particle has its minimum value $m_A + m_B$.
Solution

We note that the sum $s_{AB} + t_{AB} = \frac{2}{c^2}\left[(p_A^i)^2 + (p_B^i)^2\right] = -2m_A^2 - 2m_B^2 =$ constant. Therefore, when $s_{AB}$ is maximum, $t_{AB}$ becomes minimum and conversely. Because $s_{AB}$ is an invariant, its value is the same in all LCF. We compute it in the proper frame of the particle B. In that frame, the inner product $p_A^ip_{Bi} = -{}^{A}_{B}E\,m_B$, therefore $s_{AB} = -m_A^2 - m_B^2 - 2\,{}^{A}_{B}E\,m_B/c^2$. The maximum value of the right-hand side occurs when ${}^{A}_{B}E = m_Ac^2$, namely at the threshold of the reaction. For this energy, the value $s_{AB,max} = -(m_A + m_B)^2$ and the corresponding minimum value of the quantity $t_{AB}$ at that energy is $t_{AB,min} = -(m_A - m_B)^2$. The last question is left to the reader.
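The behaviour of the two invariants can be illustrated numerically. The sketch below evaluates $s_{AB}$ and $t_{AB}$ in the proper frame of B as functions of the energy of A in that frame; the function name `s_t`, the sample masses and the units with $c = 1$ are assumptions made for the illustration.

```python
def s_t(mA, mB, E_A_in_B, c=1.0):
    """Invariants s_AB and t_AB of (10.30), evaluated in the proper frame of B,
    where the Lorentz product of the four-momenta is -E_A * mB."""
    pA_dot_pB = -E_A_in_B * mB
    s = (-mA**2 * c**2 - mB**2 * c**2 + 2 * pA_dot_pB) / c**2
    t = (-mA**2 * c**2 - mB**2 * c**2 - 2 * pA_dot_pB) / c**2
    return s, t

# Hypothetical masses in arbitrary units.
mA, mB = 1.0, 2.0
s_thr, t_thr = s_t(mA, mB, E_A_in_B=mA)      # threshold: A at rest relative to B
s_far, t_far = s_t(mA, mB, E_A_in_B=3.0)     # A moving relative to B
print(s_thr, t_thr, s_far, t_far)
```

At the threshold the values are $-(m_A + m_B)^2$ and $-(m_A - m_B)^2$; above it $s_{AB}$ decreases and $t_{AB}$ increases while their sum stays fixed.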


10.7 Transformation of Angles 


In the last section we have shown that, given the masses, the remaining unknown elements of the generic reaction are (a) the angles between the 3-momenta of the reacting particles and (b) the orientation of the plane of the triangle of the 3-momenta in the LCF in which the reaction is studied. Because in practice the angular data are in general given in different LCF, it is necessary to study the transformation of angles under the Lorentz transformation. This study will be done between the CM frame ($\Sigma^*$) and the Lab frame ($\Sigma^L$), because these are the frames mostly used in practice.


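Exercise 10.7.1 Consider a particle with four-momentum $p^i$, which in the LCF $\Sigma^*$ and $\Sigma^L$ has components:

$$p^i = \begin{pmatrix} E^L/c \\ \mathbf{p}^L \end{pmatrix}_{\Sigma^L} = \begin{pmatrix} E^*/c \\ \mathbf{p}^* \end{pmatrix}_{\Sigma^*}.$$

Assume that the velocity factor of $\Sigma^*$ in $\Sigma^L$ is $\boldsymbol{\beta}$ and show that between the 3-vectors $\mathbf{p}^L$ and $\mathbf{p}^*$ the following relation holds:

$$\mathbf{p}^L = \mathbf{p}^* + (\gamma - 1)\frac{\mathbf{p}^*\cdot\boldsymbol{\beta}}{\beta^2}\boldsymbol{\beta} + \gamma\frac{E^*}{c}\boldsymbol{\beta}. \tag{10.31}$$

Furthermore, show that the factors $\boldsymbol{\beta}^L$, $\boldsymbol{\beta}^*$ of the particle in the frames $\Sigma^L$ and $\Sigma^*$ respectively are given by the relations:

$$\boldsymbol{\beta}^L = \frac{c\,\mathbf{p}^L}{E^L}, \qquad \boldsymbol{\beta}^* = \frac{c\,\mathbf{p}^*}{E^*}. \tag{10.32}$$

Using relations (10.31) and (10.32) we are able to compute the components of the four-momentum in $\Sigma^L$ when they are known in $\Sigma^*$ and conversely.
Without loss of generality (why?) we direct the axes of $\Sigma^*$ and $\Sigma^L$ in such a way that they are related by a boost along the $z$-axis with velocity factor $\beta$. Then the Lorentz transformation gives:

$$E^*/c = \gamma(E^L/c - \beta p_z^L)$$

$$p_x^* = p_x^L$$

$$p_y^* = p_y^L \tag{10.33}$$

$$p_z^* = \gamma(p_z^L - \beta E^L/c).$$

We consider in each of the frames $\Sigma^*$, $\Sigma^L$ spherical coordinates (see Fig. 10.8) and have the relations:

$$p_x^L = P^L\sin\theta^L\cos\phi^L, \qquad p_x^* = P^*\sin\theta^*\cos\phi^*$$

$$p_y^L = P^L\sin\theta^L\sin\phi^L, \qquad p_y^* = P^*\sin\theta^*\sin\phi^*$$

$$p_z^L = P^L\cos\theta^L, \qquad p_z^* = P^*\cos\theta^*.$$

In the new coordinates the equations of the transformation (10.33) are written as follows:

$$E^*/c = \gamma(E^L/c - \beta P^L\cos\theta^L) \tag{10.34}$$

$$P^*\sin\theta^*\cos\phi^* = P^L\sin\theta^L\cos\phi^L \tag{10.35}$$

$$P^*\sin\theta^*\sin\phi^* = P^L\sin\theta^L\sin\phi^L \tag{10.36}$$

$$P^*\cos\theta^* = \gamma(P^L\cos\theta^L - \beta E^L/c). \tag{10.37}$$

Dividing (10.36) by (10.35) we find:

$$\tan\phi^L = \tan\phi^*. \tag{10.38}$$

Because $0 \le \phi^L, \phi^* \le \pi$ it follows:

$$\phi^L = \phi^*. \tag{10.39}$$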


Fig. 10.8 Spherical coordinates in momentum space


Then the equations of the transformation are written:

$$E^*/c = \gamma(E^L/c - \beta P^L\cos\theta^L) \tag{10.40}$$

$$P^*\sin\theta^* = P^L\sin\theta^L \tag{10.41}$$

$$\phi^* = \phi^L \tag{10.42}$$

$$P^*\cos\theta^* = \gamma(P^L\cos\theta^L - \beta E^L/c). \tag{10.43}$$

In order to derive the transformation between the angles $\theta^L$, $\theta^*$ we divide relations (10.43), (10.41) and find (compare with (9.38)):

$$\cot\theta^* = \gamma\left(\cot\theta^L - \frac{r^L}{\sin\theta^L}\right) \tag{10.44}$$

where the quantity $r^L$ is the quotient of the $\beta$ factor relating $\Sigma^L$ and $\Sigma^*$ to the factor $\beta^L$ of the particle in the lab frame:

$$r^L = \frac{\beta}{\beta^L}. \tag{10.45}$$


We have the following result. 


Exercise 10.7.2 Show that when r= > 1, the angle 6" has a maximum, which 
occurs for the value: 


Z -1 1! 
Omax = COS =r (10.46) 
and equals: 
* TK 
sindh = pe (10.47) 
max BLTL 


where $\Gamma^* = 1/\sqrt{1 - \beta^{*2}}$ is the $\gamma$ factor of the particle in $\Sigma^*$.


Exercise 10.7.3 Using the inverse Lorentz transformation, show that the following relation holds:

$$\cot\theta^L = \gamma\left(\cot\theta^* + \frac{r^*}{\sin\theta^*}\right) \tag{10.48}$$

where the quantity $r^*$ is defined as follows:

$$r^* = \frac{\beta}{\beta^*}. \tag{10.49}$$




Show that when r* > 1, the angle 0* has a maximum for the value: 


* —l 1 
Omax = COS a (10.50) 
and this maximum equals: 
any 
sin OF ax aaa (10.51) 


It remains to compute the length $P^L = |\mathbf{p}^L|$ of the 3-momentum of the particle in the laboratory frame $\Sigma^L$ in terms of the components of the four-momentum in the CM frame $\Sigma^*$. From the transformation of energy (see (10.40)) we have:

$$\gamma E^L = E^* + \gamma\beta cP^L\cos\theta^L.$$

Replacing in this $E^L = \sqrt{(P^L)^2c^2 + m^2c^4}$ and squaring we find:

$$\gamma^2(1 - \beta^2\cos^2\theta^L)(cP^L)^2 - 2\beta\gamma E^*\cos\theta^L\,cP^L + \gamma^2m^2c^4 - (E^*)^2 = 0. \tag{10.52}$$

The term:

$$1 - \beta^2\cos^2\theta^L = 1 - \beta^2(1 - \sin^2\theta^L) = \frac{1}{\gamma^2}(1 + \beta^2\gamma^2\sin^2\theta^L)$$

and similarly the term:

$$\gamma^2m^2c^4 - (E^*)^2 = \gamma^2m^2c^4 - (cP^*)^2 - m^2c^4 = -(cP^*)^2 + \beta^2\gamma^2m^2c^4.$$

Replacing in (10.52) we find a quadratic equation in terms of $P^L$:

$$(1 + \beta^2\gamma^2\sin^2\theta^L)(cP^L)^2 - 2\beta\gamma E^*\cos\theta^L\,cP^L - (cP^*)^2 + \beta^2\gamma^2m^2c^4 = 0 \tag{10.53}$$

whose solution is:

$$cP^L = \frac{X}{1 + \beta^2\gamma^2\sin^2\theta^L} \tag{10.54}$$

where:

$$X = \beta\gamma E^*\cos\theta^L \pm \sqrt{(\beta\gamma E^*\cos\theta^L)^2 - (1 + \beta^2\gamma^2\sin^2\theta^L)(\beta^2\gamma^2m^2c^4 - (cP^*)^2)}. \tag{10.55}$$

The term in the square root can be written:

$$\gamma^2\left[(cP^*)^2 - \beta^2\gamma^2m^2c^4\sin^2\theta^L\right].$$

Therefore, finally, we have:

$$cP^L = \frac{1}{1 + \beta^2\gamma^2\sin^2\theta^L}\left[\beta\gamma E^*\cos\theta^L \pm \gamma\sqrt{(cP^*)^2 - \beta^2\gamma^2m^2c^4\sin^2\theta^L}\right]. \tag{10.56}$$

From relation (10.48) one computes $\sin\theta^L$, $\cos\theta^L$ in terms of the trigonometric functions of $\theta^*$, therefore relation (10.56) gives the 3-momentum of the particle in the lab frame $\Sigma^L$ if the four-momentum is known in the CM frame $\Sigma^*$.
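Relations (10.48) and (10.56) can be checked against a direct boost of the four-momentum. The following sketch does this for one sample particle; the function names, the numerical values and the convention $c = 1$ are assumptions made for the illustration, and the root with the plus sign in (10.56) is used, which is the appropriate one when the particle is faster in the CM frame than the CM frame is in the lab.

```python
import math

def lab_from_cm(P_cm, theta_cm, m, beta, c=1.0):
    """Direct boost along z (inverse of (10.33)) of a particle with CM momentum
    length P_cm and polar angle theta_cm. Returns (P_lab, theta_lab)."""
    gamma = 1 / math.sqrt(1 - beta**2)
    E_cm = math.sqrt((c * P_cm)**2 + (m * c**2)**2)
    pz = gamma * (P_cm * math.cos(theta_cm) + beta * E_cm / c)
    pt = P_cm * math.sin(theta_cm)          # transverse part is unchanged
    return math.hypot(pz, pt), math.atan2(pt, pz)

def cot_theta_lab(theta_cm, beta, beta_particle_cm):
    """Relation (10.48) with r* = beta / beta* from (10.49)."""
    gamma = 1 / math.sqrt(1 - beta**2)
    r_star = beta / beta_particle_cm
    return gamma * (1 / math.tan(theta_cm) + r_star / math.sin(theta_cm))

def P_lab_from_angle(P_cm, theta_lab, m, beta, c=1.0, sign=+1):
    """Relation (10.56); `sign` selects the root of the quadratic (10.53)."""
    gamma = 1 / math.sqrt(1 - beta**2)
    E_cm = math.sqrt((c * P_cm)**2 + (m * c**2)**2)
    s2 = math.sin(theta_lab)**2
    root = gamma * math.sqrt((c * P_cm)**2 - (beta * gamma * m * c**2)**2 * s2)
    numerator = beta * gamma * E_cm * math.cos(theta_lab) + sign * root
    return numerator / (1 + beta**2 * gamma**2 * s2) / c

# Hypothetical sample values (c = 1).
P_cm, theta_cm, m, beta = 1.0, 1.0, 0.5, 0.6
beta_particle_cm = P_cm / math.sqrt(P_cm**2 + m**2)
P_lab, theta_lab = lab_from_cm(P_cm, theta_cm, m, beta)
cot_direct = 1 / math.tan(theta_lab)
cot_formula = cot_theta_lab(theta_cm, beta, beta_particle_cm)
P_formula = P_lab_from_angle(P_cm, theta_lab, m, beta)
print(P_lab, P_formula, cot_direct, cot_formula)
```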


10.7.1 Radiative Transitions 


One important application of the results of the last section occurs when one of the produced particles is a photon. These reactions are of the general form⁴:


$$A \to B + \gamma$$


where A, B are some sets of particles or a single particle. Such reactions are the 
excitation and de-excitation of atoms, various decay reactions of particles (e.g. 
SiS AL ee Sy vy) etc. The study of these reactions is done in the 
CM frame, which coincides with the proper frame of the mother particle, or in the 
lab frame, in which the mother particle A has 3-momentum p. The 3-dimensional 
schematic representation of the reaction is shown in Fig. 10.9. 

Without making any further calculations, we use relations (10.14), (10.15), 
(10.16), and (10.17) and write the energies of the particle B and the photon in the 
CM frame (which is the proper frame of the particle A): 


$${}^{\gamma}_{A}E = \frac{m_A^2 - m_B^2}{2m_A}c^2, \qquad {}^{B}_{A}E = \frac{m_A^2 + m_B^2}{2m_A}c^2 \tag{10.57}$$

$$|{}^{\gamma}_{A}\mathbf{P}| = |{}^{B}_{A}\mathbf{P}| = \frac{c}{2m_A}\lambda(m_A, m_B, 0) = \frac{{}^{\gamma}_{A}E}{c} = \frac{m_A^2 - m_B^2}{2m_A}c. \tag{10.58}$$

Fig. 10.9 Decay in the lab frame and in the CM frame


⁴ $m_A \neq 0$; see Example 10.4.1.


From the above relations it is clear that for the reaction to be possible, the mass of the mother particle A must be greater than the mass of the daughter particle B. The difference of the masses $\Delta m = m_A - m_B$ we call proper mass loss; multiplied by $c^2$, it equals the kinetic energy of the products in the CM frame. Indeed, we calculate for the kinetic energy of the particle B:

$${}^{B}_{A}T = {}^{B}_{A}E - m_Bc^2 = \frac{c^2}{2m_A}(m_A - m_B)^2$$

and for the photon:

$${}^{\gamma}_{A}T = {}^{\gamma}_{A}E = \frac{m_A^2 - m_B^2}{2m_A}c^2.$$

Therefore, the total kinetic energy ${}_{A}T_{total}$ of the products in the CM frame is:

$${}_{A}T_{total} = {}^{B}_{A}T + {}^{\gamma}_{A}T = \frac{c^2}{2m_A}\left[(m_A - m_B)^2 + m_A^2 - m_B^2\right] = (m_A - m_B)c^2. \tag{10.59}$$


Finally, concerning the factor ${}^{B}_{A}\beta$ of the particle B in the CM frame we have:

$${}^{B}_{A}\beta = \frac{|{}^{B}_{A}\mathbf{P}|c}{{}^{B}_{A}E} = \frac{m_A^2 - m_B^2}{m_A^2 + m_B^2}. \tag{10.60}$$

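In order to compute the corresponding quantities in the laboratory frame we apply the boost relating the two frames with velocity factor $\beta = \beta_A$ of the particle A in the lab frame. For particle B we have:

$${}^{B}_{L}E = \gamma_A\left({}^{B}_{A}E + \beta_A\,c\,{}^{B}_{A}P\cos\theta^*\right).$$

But ${}^{B}_{A}P\,c = {}^{\gamma}_{A}P\,c = {}^{\gamma}_{A}E$, therefore:

$${}^{B}_{L}E = \gamma_A\left({}^{B}_{A}E + \beta_A\,{}^{\gamma}_{A}E\cos\theta^*\right). \tag{10.61}$$

For the photon:

$${}^{\gamma}_{L}E = \gamma_A\left({}^{\gamma}_{A}E - \beta_A\,c\,{}^{\gamma}_{A}P\cos\theta^*\right) = \gamma_A\,{}^{\gamma}_{A}E\,(1 - \beta_A\cos\theta^*). \tag{10.62}$$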


Using the energy e E we compute the 3-momentum: 




while for the photon we have: 


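It remains to compute the angular quantities of the daughter particles. In order to compute the total angle $\theta + \phi$ in the lab frame, we square the conservation equation:

$$m_A^2c^4 = m_B^2c^4 - 2\left(-{}^{B}_{L}E\;{}^{\gamma}_{L}E + {}^{B}_{L}P\;{}^{\gamma}_{L}P\,c^2\cos(\theta + \phi)\right)$$

and solve for $\cos(\theta + \phi)$. The result is:

$$\cos(\theta + \phi) = \frac{2\,{}^{B}_{L}E\;{}^{\gamma}_{L}E - (m_A^2 - m_B^2)c^4}{2\,{}^{B}_{L}P\;{}^{\gamma}_{L}P\,c^2}.$$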


This relation calculates the angle 6 + ¢ in terms of the energies and the 3-momenta 
of the daughter particles in the lab frame. 

In order to compute the angles in the lab frame, we apply relations (10.48) and (10.49). For particle B we have:

$$\cot\theta^L = \gamma_A\left(\cot\theta^* + \frac{r^*}{\sin\theta^*}\right)$$

where $r^* = \beta_A/{}^{B}_{A}\beta$. Similarly, for the photon we find:


cot ot = ya (corp + "y ) = va (coro* - Ba ) 


sin d* sin d* 


The maximum angle between the direction of emission of the photon and the 
direction of motion of the mother particle in the lab frame is given by the relation 
(10.51): 


a 1 
Pp YE ya (1— Bcosé*) 


Example 10.7.1 An excited atom of mass $m^*$ makes a transition from the excited state to the ground state of mass $m$ by the emission of a photon. Subsequently, this photon collides with a similar atom, which is in the ground state. Determine the condition on the energy of the second particle in the ground state in order to make a transition to the excited state.
Solution

The reactions are $1 \to \gamma + 2$ and $3 + \gamma \to 4$, where 1, 4 are the atoms of mass $m^*$ in the excited state and 2, 3 are the atoms of mass $m$ in the ground state. If we denote by $p_I^i$, $I = 1, 2, 3, 4$ the four-momenta of the corresponding particles, we have the conservation equations:

$$p_1^i = p_\gamma^i + p_2^i, \qquad p_\gamma^i + p_3^i = p_4^i.$$

The energy of the photon in the proper frame of atom 1 is (see equation (10.16) or write the reaction in the form $p_1^i - p_\gamma^i = p_2^i$ and square):

$${}^{\gamma}_{1}E = E_\gamma = \frac{m^{*2} - m^2}{2m^*}c^2. \tag{10.63}$$

Squaring the second conservation equation and using (10.63) we find:

$$2p_\gamma^ip_{3i} = (m^2 - m^{*2})c^2 = -2m^*E_\gamma. \tag{10.64}$$

Assume that in the proper frame $\Sigma_1$ of particle 1 the four-momenta are $p_\gamma^i = \frac{E_\gamma}{c}\begin{pmatrix} 1 \\ \hat{\mathbf{e}} \end{pmatrix}_{\Sigma_1}$, $p_3^i = \begin{pmatrix} E_3/c \\ \mathbf{p}_3 \end{pmatrix}_{\Sigma_1}$, where $\hat{\mathbf{e}}$ is the direction of propagation of the photon in $\Sigma_1$. Then (10.64) gives:

$$\hat{\mathbf{e}}\cdot\mathbf{p}_3 = \frac{1}{c}(E_3 - m^*c^2). \tag{10.65}$$

Let $p_{3\perp}$ be the component of $\mathbf{p}_3$ normal to $\hat{\mathbf{e}}$ and $p_{3\parallel} = \frac{1}{c}(E_3 - m^*c^2)$ the component parallel to the direction $\hat{\mathbf{e}}$. We have:

$$E_3^2 - p_{3\perp}^2c^2 - p_{3\parallel}^2c^2 - m^2c^4 = 0 \Rightarrow$$

$$E_3^2 - p_{3\parallel}^2c^2 - m^2c^4 \ge 0.$$

Replacing $p_{3\parallel}$ we find the required condition:

$$E_3 \ge \frac{m^{*2} + m^2}{2m^*}c^2.$$

We conclude that the transition is possible only if the atom 3 has a non-vanishing speed in the proper frame of the initial excited atom 1. To find the $\gamma$-factor of this motion we write $E_3 = m\gamma_3c^2$ and have:

$$\gamma_3 \ge \frac{m^{*2} + m^2}{2mm^*}.$$

The corresponding $\beta$ factor is computed to be:

$$\beta_3 \ge \frac{(m^* - m)(m^* + m)}{m^{*2} + m^2}.$$

To obtain a physical estimate of the result, we consider the excitation energy $\Delta E = (m^* - m)c^2$ and have:

$$\beta_3 \ge \frac{(\Delta E/c^2)(m^* + m)}{m^{*2} + m^2}.$$

If the excitation energy $\Delta E \ll mc^2$, we can approximate $m^* \approx m$ and the above relation reduces to:

$$\beta_3 \ge \frac{\Delta E}{mc^2} = \frac{h\nu}{mc^2}$$

where $\nu$ is the frequency of the emitted photon.
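The bounds obtained in Example 10.7.1 can be evaluated for sample masses, and the small-excitation approximation checked, with the following sketch. The function name `min_gamma_beta`, the sample masses and the units with $c = 1$ are hypothetical choices made for the illustration.

```python
import math

def min_gamma_beta(m_excited, m_ground):
    """Minimum gamma and beta factors of atom 3 in the proper frame of atom 1
    required for the absorption of the photon (Example 10.7.1)."""
    gamma_min = (m_excited**2 + m_ground**2) / (2 * m_ground * m_excited)
    beta_min = (m_excited**2 - m_ground**2) / (m_excited**2 + m_ground**2)
    return gamma_min, beta_min

# Hypothetical case with a large excitation (arbitrary units, c = 1).
gamma_min, beta_min = min_gamma_beta(1.5, 1.0)

# Hypothetical case with a small excitation: beta_min is close to Delta E / (m c^2).
m, delta = 1.0, 1e-6
_, beta_small = min_gamma_beta(m + delta, m)
ratio = beta_small / (delta / m)
print(gamma_min, beta_min, ratio)
```

The two bounds are consistent with each other, since the minimum $\gamma$ factor equals $1/\sqrt{1 - \beta^2}$ evaluated at the minimum $\beta$.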


10.7.2 Reactions with Two Photon Final State


The reactions with two photons final state⁵ are important in practice and include the positron-electron annihilation $e^+ + e^- \to \gamma + \gamma$ and the two photon decay of the pion $\pi^0 \to \gamma + \gamma$, which is used in experimental particle research to recognize photons from $\pi^0$ or $\eta^0$ decay against a background of uncorrelated photons. In the case of two photon final state, the results of Sect. 10.7.1 do not apply and we have to consider this case as a new one.

We consider the reaction:

$$\underset{1}{A} \to \underset{2}{\gamma} + \underset{3}{\gamma}$$

where A is the CM particle. For example, we consider the reaction $e^+ + e^- \to \gamma + \gamma$ in the following two stages:

$$e^+ + e^- \to A \to \gamma + \gamma$$

and assume that in the lab frame the CM particle A describes the system of the particles $e^+$, $e^-$. This consideration is not obligatory but, as a rule, it is the one recommended, because it leads to the result with the use of the general results on the generic reaction A + B → C.

We consider the electron to be at rest in the lab frame when a positron of energy $E_1$ (in the lab frame) interacts producing two photons. Let us assume that in the CM frame the direction of propagation of the photons makes an angle $\theta^*$ with the common direction of $e^+$, $e^-$ (see Fig. 10.9). First we compute the various parameters of the CM particle and then the angle between the directions of the photons in the lab frame.


⁵ Reactions with products $\nu + \gamma$ and $\nu + \nu$ have not been observed and seem to be forbidden by a quantal conservation law.

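We consider the first part of the reaction (i.e. $e^+ + e^- \to A$) and have for the
mass $M$ of the CM particle:

$$p_1^i + p_2^i = p_A^i \;\Rightarrow\; p_1^i p_{2i} = \frac{1}{2}\left(-M^2 + 2m^2\right)c^2.$$

Because the electron rests in the lab we have $p_1^i p_{2i} = -E_1 m$. Replacing we find:

$$M = m\sqrt{2(1 + \gamma_1)}$$

where $\gamma_1 = \frac{E_1}{mc^2}$ is the $\gamma$-factor of the positron in the lab frame.
Next we compute the $\beta^*$ factor of the CM particle in the lab frame. We have:

$$\beta^* = \frac{|\mathbf{p}_1 + \mathbf{p}_2|c}{E_1 + mc^2} = \frac{m\beta_1\gamma_1c^2}{m\gamma_1c^2 + mc^2} = \frac{\beta_1\gamma_1}{1 + \gamma_1} = \sqrt{\frac{\gamma_1 - 1}{\gamma_1 + 1}}$$

$$\gamma^* = \frac{1}{\sqrt{1 - \beta^{*2}}} = \sqrt{\frac{\gamma_1 + 1}{2}}.$$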


The energies $E_3^*$, $E_4^*$ of the photons in the CM frame are equal (to show this simply
consider the projection of the 3-momentum of the photons normal to the direction
of motion of the mother particle in the CM frame). Let $E_3^* = E_4^* = E^*$ be the
common value of the energy. To compute $E^*$, we consider relation (10.14) for the
right part of the reaction (i.e. $A \to \gamma + \gamma$) and write:

$$E^* = \frac{M^2 - 0 - 0}{2M}c^2 = \frac{Mc^2}{2} = mc^2\sqrt{\frac{\gamma_1 + 1}{2}}.$$


To calculate the energies of the photons in the lab frame we consider the x axis 
along the direction of motion of the positron and use the boost from the CM frame 
to the lab frame. We find: 


Also: 


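$$\frac{E_3}{c} = \gamma^*\left(\frac{E^*}{c} + \beta^* p^*_{3x}\right) = \gamma^*\frac{E^*}{c}\left(1 + \beta^*\cos\theta^*\right).$$

Similarly, for the other photon we compute:

$$\frac{E_4}{c} = \gamma^*\frac{E^*}{c}\left(1 - \beta^*\cos\theta^*\right).$$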




We note that the ratios of the energies are:

$$\frac{E^*}{E_3} = \frac{1}{\gamma^*(1 + \beta^*\cos\theta^*)}, \tag{10.66}$$

$$\frac{E^*}{E_4} = \frac{1}{\gamma^*(1 - \beta^*\cos\theta^*)}. \tag{10.67}$$
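It remains to compute the angle $\theta$ between the directions of the photons in the
lab frame. For this, we write the four-momenta $p_3^i$, $p_4^i$ in the lab frame and compute
the inner product $p_3^i p_{4i}$. We have:

$$p_3^i = \frac{E_3}{c}\begin{pmatrix} 1 \\ \mathbf{e}_3 \end{pmatrix}_L, \qquad p_4^i = \frac{E_4}{c}\begin{pmatrix} 1 \\ \mathbf{e}_4 \end{pmatrix}_L$$

$$p_3^i p_{4i} = \frac{E_3E_4}{c^2}\left(-1 + \cos\theta\right).$$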

In the CM frame we have:

$$p_3^i p_{4i} = \frac{E^{*2}}{c^2}\left(-1 + \cos\pi\right) = -\frac{2E^{*2}}{c^2}.$$


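Equating the two results and solving for $\theta$ we get:

$$2\sin^2\frac{\theta}{2} = \frac{2E^{*2}}{E_3E_4} \;\Rightarrow\; \sin^2\frac{\theta}{2} = \frac{E^{*2}}{E_3E_4}.$$

We replace the ratios $\frac{E^*}{E_3}$, $\frac{E^*}{E_4}$ from (10.66) and (10.67) and have the final answer:

$$\sin\frac{\theta}{2} = \frac{1}{\gamma^*\sqrt{1 - \beta^{*2}\cos^2\theta^*}}. \tag{10.68}$$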


We note that when $\theta^* = \pi/2$ (that is, in the CM frame the directions of the
photons are perpendicular to the direction of motion of the positron):

$$\sin\frac{\theta}{2} = \frac{1}{\gamma^*} = \sqrt{\frac{2}{1 + \gamma_1}}. \tag{10.69}$$

In this special case, we also have $E_3 = E_4$ (why?) so that the photons are emitted
in the lab symmetrically to the direction of motion of the positrons and with equal
frequency (colour). Finally for the value $\gamma_1 = 3$, the angle $\theta = \pi/2$ and the
photons in the lab frame are propagating at 45° to the direction of motion of the
positrons in the lab frame.
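The chain of relations from the CM mass to the opening angle (10.68) can be checked numerically. The Python sketch below is an illustration under stated assumptions: the function name `two_photon_lab` and the default units $m = c = 1$ are choices made here, not part of the text.

```python
import math

def two_photon_lab(gamma1, theta_star, m=1.0, c=1.0):
    """Lab energies E3, E4 and opening angle theta of the two photons
    for a positron of gamma-factor gamma1 hitting an electron at rest."""
    M = m * math.sqrt(2.0 * (1.0 + gamma1))            # mass of the CM particle
    beta_cm = math.sqrt((gamma1 - 1.0) / (gamma1 + 1.0))
    gamma_cm = math.sqrt((gamma1 + 1.0) / 2.0)
    e_star = 0.5 * M * c**2                             # photon energy in the CM frame
    e3 = gamma_cm * e_star * (1.0 + beta_cm * math.cos(theta_star))
    e4 = gamma_cm * e_star * (1.0 - beta_cm * math.cos(theta_star))
    sin_half = e_star / math.sqrt(e3 * e4)              # sin(theta/2)
    theta = 2.0 * math.asin(min(1.0, sin_half))
    return e3, e4, theta

e3, e4, theta = two_photon_lab(3.0, math.pi / 2.0)
print(e3, e4, math.degrees(theta))
```

For $\gamma_1 = 3$ and $\theta^* = \pi/2$ the two photons share the available energy equally and the opening angle is a right angle, in agreement with (10.69).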




Fig. 10.10 Annihilation $e^+$-$e^-$


Example 10.7.2 (Isotropy and constancy of c) A beam of positrons of energy $E$
hits a target of electrons, which is resting in the lab. From the scattering, photons
are produced, which are detected with two counters A, B placed in the plane $x$-$y$
at equal distances from the target and making an angle $\phi$ with the axes $x$ and $y$
respectively (see Fig. 10.10).

1. Calculate the energy $E$ as a function of the angle $\phi$ (and the mass $m$ of the
electron). What happens when the detectors A, B are placed at an angle $\phi = 45°$?

2. Consider the pair $e^+$, $e^-$ as a source of photons and show how it is possible to
demonstrate that the speed of light is isotropic and independent of the speed of
the emitter.


Solution

1. The reaction is:

$$\underset{1}{e^+} + \underset{2}{e^-} \to \underset{3}{\gamma} + \underset{4}{\gamma}$$

In the lab frame (= proper frame of the electron) the energy of particle 1 is:

$$E = E_1 = \frac{M^2 - 2m^2}{2m}c^2 \tag{10.70}$$

where $M$ is the mass of the CM particle. Squaring the conservation equation of
four-momentum, we find the mass $M$ in terms of the scalar product $p_3 \cdot p_4$:

$$-M^2c^2 = 2p_3^i p_{4i}. \tag{10.71}$$

Conservation of 3-momentum along the $y$-axis gives:

$$E_3\sin\phi = E_4\cos\phi \;\Rightarrow\; E_4 = E_3\tan\phi. \tag{10.72}$$




Conservation of energy in the lab frame gives:

$$E_1 + mc^2 = E_3 + E_4 \;\Rightarrow\; E_3 = \frac{E_1 + mc^2}{1 + \tan\phi}. \tag{10.73}$$


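In order to compute $M$ we decompose $p_3^i$, $p_4^i$ in the lab frame:

$$p_3^i = \frac{E_3}{c}\begin{pmatrix} 1 \\ \cos\phi \\ \sin\phi \\ 0 \end{pmatrix}_L, \qquad p_4^i = \frac{E_4}{c}\begin{pmatrix} 1 \\ -\sin\phi \\ -\cos\phi \\ 0 \end{pmatrix}_L$$

and replace the unknown energies $E_3$, $E_4$ from relations (10.72) and (10.73).
The result is:

$$p_3^i p_{4i} = -\frac{E_3^2}{c^2}\tan\phi\,(1 + \sin 2\phi) = -\frac{1}{2}\sin 2\phi\left(\frac{E_1 + mc^2}{c}\right)^2.$$

We replace the inner product $p_3^i p_{4i}$ in (10.71) and find the mass:

$$M^2 = \sin 2\phi\,\frac{(E_1 + mc^2)^2}{c^4}.$$

Therefore, from (10.70), the energy of the positrons in the lab frame in terms of the angle $\phi$ is:

$$E_1 = \frac{2 - \sin 2\phi}{\sin 2\phi}\,mc^2. \tag{10.74}$$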


The factor $\gamma(\phi)$ of the positrons in the lab frame is:

$$\gamma(\phi) = \frac{E_1}{mc^2} = \frac{2 - \sin 2\phi}{\sin 2\phi}.$$

For $\phi = 45°$ we find $\gamma(45°) = 1$, that is, the positron reacts at rest in the lab
frame (threshold of the reaction) and the energy has its minimal value $E_1 = mc^2$.
[Exercise: Compute the derivative $\frac{dE_1}{d\phi}$ and verify that the minimum value of the
energy $E_1$ occurs for the value $\phi = 45°$].

From equations (10.72) and (10.73) we compute $E_3(45°) = E_4(45°) = mc^2$,
which means that the emitted photons have the same frequency, which is
determined by the mass of the electron only. Kinematically the angle $\phi$ measures
the $\beta$ factor of the positron in the lab frame and the value 45° corresponds to the
threshold of the reaction.
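Relations (10.72) to (10.74) can be evaluated for any detector angle. The sketch below is illustrative only: the function names `positron_gamma` and `photon_energies` and the units $m = c = 1$ are assumptions introduced here.

```python
import math

def positron_gamma(phi):
    """gamma-factor of the positron as a function of the detector angle phi, eq. (10.74)."""
    s = math.sin(2.0 * phi)
    return (2.0 - s) / s

def photon_energies(phi, m=1.0, c=1.0):
    """Lab energies (E3, E4) of the two photons, eqs. (10.72) and (10.73)."""
    e1 = positron_gamma(phi) * m * c**2
    e3 = (e1 + m * c**2) / (1.0 + math.tan(phi))
    e4 = e3 * math.tan(phi)
    return e3, e4

for deg in (30.0, 40.0, 45.0):
    phi = math.radians(deg)
    print(deg, positron_gamma(phi), photon_energies(phi))
```

As a consistency check, the $x$-component of the photon momenta, $E_3\cos\phi - E_4\sin\phi$ (in units of $c$), reproduces the positron momentum $\sqrt{E_1^2 - m^2c^4}$.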




2. We consider the pair $e^+$, $e^-$ as a source of photons (photon emitter), whose
velocity is the velocity of the CM particle. Then we have the following
situation:

• For the various values of the angle $\phi$ the detectors A, B detect photons with
energies $E_3$, $E_4$, which are fixed uniquely by the value of the angle $\phi$ and the
energy $E_1$ of the positrons in the lab frame.

• The speed of the photon emitter in the lab is determined by the energy $E_1$.

• The detectors A, B are at equal distances from the photon emitter, therefore,
if the speed of light is constant, the photons will be detected simultaneously
at the detectors A, B (for all angles $\phi$).


From the above, we conclude that in order to prove the independence of the 
speed of light from the velocity of the photon emitter, it is enough to measure the 
time difference between the photons received at the detectors. 

This leads to the following experimental procedure. For a given energy $E_1$ of
the positrons, we measure the frequency of events registered by the detectors and
determine the time difference for each event. If the speed of light is independent of
the speed of the photon emitter, then most events for each energy $E_1$ (and angle $\phi$)
will occur at the value $\Delta t = 0$ (simultaneous events). The value $\phi = 45°$ has no
special significance, besides the fact that it corresponds to the minimum value of
the energy $E_1$, therefore it is meaningless to continue the measurements for angles
larger than 45°.

This experiment has been realized by Sandeh⁶ for various values of the positron
energy in the lab. The experimental measurements (see Fig. 10.11) did indeed show
that most events are simultaneous ($\Delta t = 0$). If we rotate the whole
measuring apparatus around the $z$-axis and repeat the experiment, then we
find the same result. This proves the isotropy of the speed of light.


10.7.3 Elastic Collisions: Scattering 


We say that a reaction or particle collision is elastic if the number and the identity 
of the reacting particles are preserved. The most well known elastic scattering is 
the Compton Scattering, in which a beam of X-rays is scattered on a source of free 
electrons according to the reaction: 


$$e + \gamma \to \gamma + e.$$


⁶ D. Sandeh, Physical Review Letters, 10, 271 (April 1983).




Fig. 10.11 Sandeh's experimental curves (events per channel versus channel number, for annihilation at rest and annihilation in flight, with the curves expected for c constant and c not constant)


Other well known reactions of elastic scattering are:

• Rutherford scattering:
$\alpha$ + nucleus $\to$ $\alpha$ + nucleus in the same state

• Proton-proton scattering without the production of other particles:

$$p + p \to p + p$$

In this reaction it is not possible to relate a given mother proton with a specific
daughter proton, that is, which initial proton corresponds to which final one. We can
only say that two protons are reacting and two protons are emerging from the
reaction.

• Elastic collision of a pion with a proton:

$$\pi + p \to \pi + p.$$


The physical quantities we wish to find in an elastic collision are the energy
and the angle of scattering of the daughter particles in the laboratory, in terms of
the energy of the bullet beam of particles. In an elastic scattering, as a rule, the lab
frame coincides with the proper frame of the target particle. However, in the study of
elastic collisions in space, the lab frame is the frame of distant stars. Such collisions
are the elastic scattering of particles of cosmic radiation of very high energy on
thermal photons, which are radiated from stellar objects. During these collisions it is
possible that these particles lose most of their energy, whereas the scattered photons
increase their energy correspondingly, with the result that their frequency increases from the
visible to $\gamma$-ray frequencies. Similar scattering phenomena happen in accelerators
with the head-on collision of laser beams with a beam of accelerating electrons.

Fig. 10.12 Elastic collision in the lab frame and the CM frame

In order to study the elastic scattering, we write the reaction in two stages with
the intermediate CM particle:

$$\underset{1}{A} + \underset{2}{B} \to P \to \underset{3}{A} + \underset{4}{B}$$

and assume that the lab frame coincides with the proper frame of particle 2.
We compute first the quantities of the CM particle $P$. For the mass $M$ we have
(Fig. 10.12):

$$E = E_1 = \frac{M^2 - m_1^2 - m_2^2}{2m_2}c^2 \;\Rightarrow\; M = \sqrt{m_1^2 + m_2^2 + 2m_2E_1/c^2}.$$

The $\beta^*$ factor of $P$ in the lab frame is:

$$\beta^* = \frac{\sqrt{E_1^2 - m_1^2c^4}}{E_1 + m_2c^2}$$

and the $\gamma^*$ factor:

$$\gamma^* = \frac{E_1 + m_2c^2}{Mc^2}.$$
* 


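Next, we compute the energy and the scattering angle of the daughter particle 3 of
mass $m_1$. From the conservation of four-momentum we have:

$$\left(p_4^i\right)^2 = \left(p_1^i + p_2^i - p_3^i\right)^2. \tag{10.75}$$

In the lab frame, the decomposition of the four-vectors is:

$$p_1^i = \begin{pmatrix} E_1/c \\ \mathbf{p}_1 \end{pmatrix}_L, \quad p_2^i = \begin{pmatrix} m_2c \\ \mathbf{0} \end{pmatrix}_L, \quad p_3^i = \begin{pmatrix} E_3/c \\ \mathbf{p}_3 \end{pmatrix}_L, \quad p_4^i = \begin{pmatrix} E_4/c \\ \mathbf{p}_4 \end{pmatrix}_L.$$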




We note that the angle $\theta_{13}$ is given by the expression:

$$\mathbf{p}_1 \cdot \mathbf{p}_3 = P_1P_3\cos\theta_{13}$$

where $P_1$, $P_3$ are the lengths of the 3-momenta of the particles 1, 3 in the lab frame.
Replacing in equation (10.75) and solving for $\cos\theta_{13}$ we find:

$$\cos\theta_{13} = \frac{-m_1^2c^4 + (E_3 - E_1)m_2c^2 + E_1E_3}{P_1P_3c^2}.$$

The unknown quantities in this equation are the quantities $P_1$, $P_3$. The length $P_1$ is
computed from the relation:

$$P_1c = \sqrt{E_1^2 - m_1^2c^4}$$

while for the calculation of $P_3$ we need to know the energy $E_3$ of particle 3 in the
lab frame. This is computed as follows.

The length $P_3^*$ of the 3-momentum and the energy $E_3^*$ of particle 3 in the CM
frame are given by the relations:

$$P_3^* = |\mathbf{p}_3^*| = \frac{c}{2M}\sqrt{\lambda(M^2, m_1^2, m_2^2)}, \qquad E_3^* = \sqrt{P_3^{*2}c^2 + m_1^2c^4}.$$

The boost relating the lab and the CM frame gives for the energy $E_3$ of particle 3 in
the lab:

$$E_3 = \gamma^*\left(E_3^* + \beta^*cP_3^*\cos\theta_{13}^*\right),$$

where $\theta_{13}^*$ is the scattering angle in
the CM frame and $\beta^*$, $\gamma^*$ are the velocity factors of the CM frame in the lab.

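A different method to compute the scattering angle in the lab frame is by the use
of the transformation equation (10.48) for the angles or, equivalently, to work as
follows.

We decompose the 3-momentum of particle 3 along and normal to the direction
of motion of particle 1 in the lab frame. Then the boost between the lab frame and
the CM frame gives:

$$\cot\theta_{13} = \frac{p_{3\parallel}}{p_{3\perp}} = \frac{\gamma^*\left(P_3^*\cos\theta_{13}^* + \beta^*E_3^*/c\right)}{P_3^*\sin\theta_{13}^*} = \gamma^*\left(\cot\theta_{13}^* + \frac{\beta^*}{\beta_3^*\sin\theta_{13}^*}\right)$$

where $\boldsymbol{\beta}_3^*$, $\gamma_3^*$ are the $\beta$, $\gamma$ factors of particle 3 in the CM frame and are computed
as follows:

$$\boldsymbol{\beta}_3^* = \frac{P_3^*c}{E_3^*}\left(\cos\theta_{13}^*\,\mathbf{i} + \sin\theta_{13}^*\,\mathbf{j}\right), \qquad \gamma_3^* = \frac{E_3^*}{m_1c^2}.$$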
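The two routes to the lab scattering angle (the $\cos\theta_{13}$ relation obtained from four-momentum conservation and the $\cot\theta_{13}$ relation obtained from the boost) can be compared numerically. The sketch below is an illustration under assumptions: the function names are hypothetical, and $\lambda$ is taken to be the standard triangle function $\lambda(a,b,c) = a^2 + b^2 + c^2 - 2ab - 2bc - 2ca$.

```python
import math

def triangle(a, b, c):
    """Triangle function lambda(a, b, c); assumed standard definition."""
    return a * a + b * b + c * c - 2.0 * (a * b + b * c + c * a)

def lab_angle_two_ways(e1, m1, m2, theta_cm, c=1.0):
    """Lab scattering angle of particle 3 computed (a) from cos(theta13)
    and (b) from cot(theta13); the two values should coincide."""
    M = math.sqrt(m1**2 + m2**2 + 2.0 * m2 * e1 / c**2)
    p1 = math.sqrt(e1**2 - m1**2 * c**4) / c
    beta_cm = p1 * c / (e1 + m2 * c**2)
    gamma_cm = (e1 + m2 * c**2) / (M * c**2)
    p_star = c * math.sqrt(triangle(M**2, m1**2, m2**2)) / (2.0 * M)
    e_star = math.sqrt(p_star**2 * c**2 + m1**2 * c**4)
    # (a) energy of particle 3 in the lab, then the cosine formula
    e3 = gamma_cm * (e_star + beta_cm * c * p_star * math.cos(theta_cm))
    p3 = math.sqrt(e3**2 - m1**2 * c**4) / c
    cos13 = (-m1**2 * c**4 + (e3 - e1) * m2 * c**2 + e1 * e3) / (p1 * p3 * c**2)
    # (b) cotangent formula from the boost
    beta3_star = p_star * c / e_star
    cot13 = gamma_cm * (1.0 / math.tan(theta_cm)
                        + beta_cm / (beta3_star * math.sin(theta_cm)))
    return math.acos(cos13), math.atan2(1.0, cot13)

print(lab_angle_two_ways(5.0, 1.0, 2.0, 1.0))
```

The input values (energy 5, masses 1 and 2, CM angle 1 rad, in units with $c = 1$) are arbitrary test numbers chosen for the illustration.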


Exercise 10.7.4 Consider two particles of non-zero mass, which collide elastically 
and show that in the CM frame the speeds of particles remain unchanged after the 
collision. 




Example 10.7.3 (Compton scattering) A photon is scattered elastically on an elec- 
tron, which rests in the laboratory. Calculate the energy of the scattered particles in 
the laboratory as functions of the energy of the incident photon and the scattering 
angle of the photon in the lab frame. 


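Solution
The reaction is:

$$\underset{1}{\gamma} + \underset{2}{e} \to \underset{3}{\gamma} + \underset{4}{e}$$

Conservation of four-momentum gives $p_4^i = p_1^i + p_2^i - p_3^i$, which after squaring
becomes:

$$p_1^i p_{2i} - p_1^i p_{3i} - p_2^i p_{3i} = 0.$$

The decomposition of the four-momenta in the lab frame is:

$$p_1^i = \frac{E_1}{c}\begin{pmatrix} 1 \\ \mathbf{e}_1 \end{pmatrix}_L, \quad p_2^i = \begin{pmatrix} mc \\ \mathbf{0} \end{pmatrix}_L, \quad p_3^i = \frac{E_3}{c}\begin{pmatrix} 1 \\ \mathbf{e}_3 \end{pmatrix}_L, \quad p_4^i = \begin{pmatrix} E_4/c \\ \mathbf{p}_4 \end{pmatrix}_L.$$

We compute the inner products in the last equation and, solving in terms of $E_3$, we
find:

$$E_3 = \frac{mc^2E_1}{E_1(1 - \cos\theta_{13}) + mc^2}$$

where $\theta_{13}$ is the scattering angle. This relation gives the energy $E_3$ in terms of the
required quantities $E_1$, $\theta_{13}$. We introduce the quantity $\epsilon = mc^2/E_1$ and have the
dimensionless ratio:

$$\frac{E_3}{E_1} = \frac{\epsilon}{1 + \epsilon - \cos\theta_{13}} = \frac{\epsilon}{\epsilon + 2\sin^2\frac{1}{2}\theta_{13}}. \tag{10.76}$$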


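For the energy $E_4$ of the scattered electron in the lab frame we have:

$$E_4 = E_1 + mc^2 - E_3 = \frac{(E_1 + mc^2)\left[E_1(1 - \cos\theta_{13}) + mc^2\right] - mc^2E_1}{E_1(1 - \cos\theta_{13}) + mc^2}.$$

We note the ratio:

$$\frac{E_4}{E_3} = \epsilon + \frac{2(1 + \epsilon)}{\epsilon}\sin^2\frac{1}{2}\theta_{13}. \tag{10.77}$$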

It remains to compute the angle $\theta_{34}$ between the scattered particles in terms of
the quantities $E_1$, $\theta_{13}$ or $\epsilon$, $\theta_{13}$. We consider the equation of conservation of
four-momentum in the form:

$$p_1^i + p_2^i = p_3^i + p_4^i$$

and squaring we get:

$$p_1^i p_{2i} = p_3^i p_{4i}.$$

We calculate the inner products in the lab and find:

$$-E_1m = -\frac{E_3E_4}{c^2} + \frac{E_3}{c}P_4\cos\theta_{34}.$$

We solve this equation in terms of $\cos\theta_{34}$ and we write the result in terms of the
ratios $\frac{E_3}{E_1}$ and $\frac{E_4}{E_3}$:

$$\cos\theta_{34} = \frac{E_3E_4 - E_1mc^2}{E_3\sqrt{E_4^2 - m^2c^4}} = \frac{E_4/E_3 - (E_1/E_3)^2\epsilon}{\sqrt{E_4^2/E_3^2 - \epsilon^2E_1^2/E_3^2}}.$$

Replacing $\frac{E_3}{E_1}$, $\frac{E_4}{E_3}$ from (10.76) and (10.77) we calculate the angle $\theta_{34}$ in terms of
the quantities $\epsilon$, $\theta_{13}$.

A different way to compute the angle $\theta_{34}$ is to compute the recoil angle $\phi$ of the
electron. This is done as follows.

First we note that the 3-momenta of the involved particles are coplanar (why?).
Then, conservation of 3-momentum along and normal to the direction of the incident
photon gives the equations:

$$|\mathbf{p}_3|\sin\theta_{13} = |\mathbf{p}_4|\sin\phi$$
$$|\mathbf{p}_1| = |\mathbf{p}_3|\cos\theta_{13} + |\mathbf{p}_4|\cos\phi.$$

Dividing these equations we find:

$$\tan\phi = \frac{|\mathbf{p}_3|\sin\theta_{13}}{|\mathbf{p}_1| - |\mathbf{p}_3|\cos\theta_{13}} = \frac{\frac{E_3}{E_1}\sin\theta_{13}}{1 - \frac{E_3}{E_1}\cos\theta_{13}},$$

where we have used the fact that $|\mathbf{p}_1| = \frac{E_1}{c}$, $|\mathbf{p}_3| = \frac{E_3}{c}$. Replacing $\frac{E_3}{E_1}$ from (10.76)
and applying standard trigonometry we find (Fig. 10.13):

$$\tan\phi = \frac{\cot\frac{1}{2}\theta_{13}}{1 + \frac{1}{\epsilon}}.$$

We note that $\tan\phi$ is always positive, therefore the electron is always thrown
forward.




Fig. 10.13 Compton scattering in the lab frame and the CM frame


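Finally, let us compute the electron's recoil speed. Conservation of energy gives:

$$E_4 = E_1 + mc^2 - E_3 = \frac{mc^2}{\epsilon} + mc^2 - \frac{\epsilon}{\epsilon + 2\sin^2\frac{1}{2}\theta_{13}}\,\frac{mc^2}{\epsilon} = mc^2\,\frac{1 + \frac{2(1+\epsilon)}{\epsilon^2}\sin^2\frac{1}{2}\theta_{13}}{1 + \frac{2}{\epsilon}\sin^2\frac{1}{2}\theta_{13}}.$$

But $E_4 = mc^2\gamma_4$, where $\gamma_4$ is the $\gamma$ factor of the recoil electron in the lab frame.
Therefore:

$$\gamma_4 = \frac{1 + \frac{2(1+\epsilon)}{\epsilon^2}\sin^2\frac{1}{2}\theta_{13}}{1 + \frac{2}{\epsilon}\sin^2\frac{1}{2}\theta_{13}}. \tag{10.78}$$

From this the $\beta$ factor is computed as:

$$\beta_4 = \frac{\frac{2}{\epsilon}\sin\frac{1}{2}\theta_{13}\sqrt{1 + \frac{1+2\epsilon}{\epsilon^2}\sin^2\frac{1}{2}\theta_{13}}}{1 + \frac{2(1+\epsilon)}{\epsilon^2}\sin^2\frac{1}{2}\theta_{13}}.$$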
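The Compton relations can be verified against energy and momentum conservation. The following Python sketch is an illustration only: the function name `compton` and the units $m = c = 1$ are assumptions, and the scattering angle is restricted to $0 < \theta_{13} \le \pi$ to avoid the division by $\tan 0$.

```python
import math

def compton(e1, theta13, m=1.0, c=1.0):
    """Photon energy E3, electron energy E4, recoil angle phi and beta_4
    for a photon of energy e1 scattered by theta13 (0 < theta13 <= pi)."""
    eps = m * c**2 / e1
    s2 = math.sin(0.5 * theta13) ** 2
    e3 = e1 * eps / (eps + 2.0 * s2)                                   # (10.76)
    gamma4 = (1.0 + 2.0 * (1.0 + eps) * s2 / eps**2) / (1.0 + 2.0 * s2 / eps)
    e4 = gamma4 * m * c**2
    phi = math.atan2(1.0 / math.tan(0.5 * theta13), 1.0 + 1.0 / eps)   # recoil angle
    beta4 = math.sqrt(1.0 - 1.0 / gamma4**2)
    return e3, e4, phi, beta4

e3, e4, phi, beta4 = compton(2.0, 1.2)
print(e3, e4, math.degrees(phi), beta4)
```

With these outputs one can confirm that $E_3 + E_4 = E_1 + mc^2$ and that the transverse momenta of the photon and the electron balance.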


Exercise 10.7.5 Determine the conditions for which the photon is scattered opposite
to its initial direction (back scattering). Note that in this case $\theta_{13} = \theta_{34} = \pi$.
Similarly study the forward scattering for which $\theta_{13} = \theta_{34} = 0$. Show that in the
forward scattering the energy of the scattered photon is maximum and equal to $E_1$,
whereas for the backward scattering it has the minimum value $\frac{\epsilon}{2 + \epsilon}E_1$. Also show
that the velocity of the recoil electron varies from zero for the forward scattering to
the maximum value $\beta_{4,\max} = \frac{2(1 + \epsilon)}{1 + (1 + \epsilon)^2}$.


Example 10.7.4 A particle of mass $m$ and kinetic energy $T_1$ in the laboratory
is scattered elastically on another similar particle which rests in the laboratory.
Calculate the kinetic energy of the scattered particle in terms of the kinetic energy
of the bullet particle and the scattering angle. Determine the maximal value of the
scattering angle.
Solution

The reaction is:

$$\underset{1}{A} + \underset{2}{B} \to \underset{3}{A} + \underset{4}{B}$$


Conservation of four-momentum gives:

$$p_4^i = p_1^i + p_2^i - p_3^i$$

and upon squaring:

$$m^2c^2 = p_1^i p_{2i} - p_1^i p_{3i} - p_2^i p_{3i}.$$

We calculate the inner products in the lab frame and find the equation:

$$m^2c^2 = -E_1m - \mathbf{p}_1 \cdot \mathbf{p}_3 + \frac{E_3}{c}\left(\frac{E_1}{c} + mc\right) \;\Rightarrow\; \mathbf{p}_1 \cdot \mathbf{p}_3 = (E_3 - mc^2)(E_1 + mc^2)/c^2.$$

But:

$$c^2\,\mathbf{p}_1 \cdot \mathbf{p}_3 = P_1P_3c^2\cos\theta_{13} = \sqrt{E_1^2 - m^2c^4}\,\sqrt{E_3^2 - m^2c^4}\,\cos\theta_{13}.$$

Replacing in the last relation, we find:

$$(E_3 - mc^2)(E_1 + mc^2) = (E_1 - mc^2)(E_3 + mc^2)\cos^2\theta_{13}.$$

The kinetic energies of the initial and the scattered particles in the lab frame are:
$T_1 = E_1 - mc^2$, $T_3 = E_3 - mc^2$.
Replacing in the last relation, we find:

$$T_3 = \frac{2mc^2\,T_1\cos^2\theta_{13}}{2mc^2 + T_1\sin^2\theta_{13}}. \tag{10.79}$$

We note that the maximal value of the scattering angle $\theta_{13}$ occurs when $T_1 = T_3$,
that is, when the kinetic energy of the bullet particle is transferred to the scattered
particle. Setting $T_1 = T_3$ in (10.79) we find:

$$(T_1 + 2mc^2)\sin^2\theta_{13} = 0 \;\Rightarrow\; \theta_{13} = 0°.$$
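Relation (10.79) is easy to tabulate. The sketch below is illustrative: the function name `scattered_kinetic_energy` and the units $m = c = 1$ are assumptions introduced here.

```python
import math

def scattered_kinetic_energy(t1, theta13, m=1.0, c=1.0):
    """Kinetic energy T3 of the scattered particle, eq. (10.79),
    for equal masses m and bullet kinetic energy t1."""
    mc2 = m * c**2
    return (2.0 * mc2 * t1 * math.cos(theta13) ** 2
            / (2.0 * mc2 + t1 * math.sin(theta13) ** 2))

for deg in (0.0, 30.0, 60.0, 90.0):
    print(deg, scattered_kinetic_energy(3.0, math.radians(deg)))
```

For $T_1 \ll mc^2$ the expression reduces to the Newtonian result $T_3 \approx T_1\cos^2\theta_{13}$ for equal masses.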


Chapter 11
Four-Force


11.1 Introduction 


In the previous chapters we considered four-vectors, which describe the evolution 
of a relativistic system in spacetime without taking into account the environment, 
which modulates motion. In the present chapter, we develop the dynamics of 
Special Relativity by introducing the four-vector of four-force. However, the real 
power of relativistic dynamics is via the introduction of the Lagrangian and the 
Hamiltonian formalism, which we shall also consider briefly. In what follows, we 
present a number of solved problems, which will familiarize the reader with simple 
applications of relativistic dynamics. 


11.2. The Four-Force 


Consider a ReMaP $P$ with four-velocity $u^i$ and mass $m(\tau)$, where $\tau$ is the proper
time of $P$. We define the four-vector (= potential relativistic physical quantity):

$$F^i = \frac{d}{d\tau}\left(m(\tau)u^i\right) = \frac{dp^i}{d\tau} \tag{11.1}$$

where $p^i = m(\tau)u^i$ is the four-momentum of $P$. The four-vector $F^i$ we name
four-force acting on $P$. Because $P$ has only mass, the only reaction it can support
is the change of the mass $\frac{dm}{d\tau}$. The remaining dynamical fields (electromagnetic
field, forces due to other mechanical systems e.g. springs etc.) demand a 'charge',
therefore they have an effect on the four-acceleration only. This is the reason that
gravity has a different role from the rest of the dynamical fields, i.e. it interacts with
the mass.¹ This becomes clear if we write the four-force as follows:

$$F^i = \frac{dm}{d\tau}u^i + ma^i. \tag{11.2}$$

© Springer Nature Switzerland AG 2019
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics,
https://doi.org/10.1007/978-3-030-27347-7_11

The first term is parallel to the four-velocity and the second is normal to it. Each 
part we associate with a different Newtonian physical quantity. Indeed, as it will be 
shown, the timelike part concerns the total variation of the energy of P due to the 
action of an inertial 3-force f (that is the force which changes the kinematics of P) 
and the spacelike part is defined in terms of f. 

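Before we continue with the study of the four-force, we must prove that it
corresponds to a relativistic physical quantity and it is not simply a four-vector
without physical significance. To do this, we consider the proper frame $\Sigma^+$ of $P$
and from (11.2) we obtain:

$$F^i = \frac{dm}{d\tau}\begin{pmatrix} c \\ \mathbf{0} \end{pmatrix}_{\Sigma^+} + m\begin{pmatrix} 0 \\ \mathbf{a}^+ \end{pmatrix}_{\Sigma^+} = \begin{pmatrix} \frac{dm}{d\tau}c \\ m\mathbf{a}^+ \end{pmatrix}_{\Sigma^+}. \tag{11.3}$$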

If in $\Sigma^+$ we identify the invariant $\frac{dm}{d\tau}$ with the Newtonian physical quantity $\frac{1}{c^2}\left(\frac{dE}{dt}\right)_{\Sigma^+}$
and the space part $m\mathbf{a}^+$ with the Newtonian 3-force $\mathbf{f}_{\Sigma^+}$, then the components of the
four-vector $F^i$ attain physical significance, therefore this four-vector represents a
relativistic physical quantity.

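In order to understand the physical meaning of the two parts of the four-force
in an arbitrary frame $\Sigma$, we consider the decomposition of $F^i$ in $\Sigma$. If the
four-momentum $p^i$ in $\Sigma$ is $\begin{pmatrix} E/c \\ \mathbf{p} \end{pmatrix}_\Sigma$, we have:

$$F^i = \frac{dp^i}{d\tau} = \frac{dt}{d\tau}\frac{d}{dt}\begin{pmatrix} E/c \\ \mathbf{p} \end{pmatrix}_\Sigma = \begin{pmatrix} \gamma\dot{E}/c \\ \gamma\mathbf{f} \end{pmatrix}_\Sigma \tag{11.4}$$

where $\mathbf{f} = \dot{\mathbf{p}}$ is the 3-force in $\Sigma$ and $\dot{E} = \frac{dE}{dt}$. The quantity $\dot{E}$ can be
computed in two ways. Either from the relation:

$$E = \sqrt{\mathbf{p}^2c^2 + m^2c^4} \;\Rightarrow\; \dot{E} = \mathbf{f} \cdot \mathbf{u} + \frac{c^2}{\gamma^2}\frac{dm}{d\tau} \tag{11.5}$$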

¹ Recall that gravity interacts with the gravitational mass, whereas the mass which enters Newton's
Second Law is the inertial mass. However, due to the Eötvös experiments we identify these two masses
(Equivalence Principle) in Newtonian Gravity. The same Principle is assumed to hold in General
Relativity.


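or, by using the invariant $F^iu_i$. In this case we have:

$$F^iu_i = \left(\frac{dm}{d\tau}u^i + ma^i\right)u_i = -c^2\frac{dm}{d\tau}. \tag{11.6}$$

In $\Sigma$ the value of the invariant is:

$$F^iu_i = \begin{pmatrix} \gamma\dot{E}/c \\ \gamma\mathbf{f} \end{pmatrix}_\Sigma \cdot \begin{pmatrix} \gamma c \\ \gamma\mathbf{u} \end{pmatrix}_\Sigma = \gamma^2\left(-\dot{E} + \mathbf{f} \cdot \mathbf{u}\right). \tag{11.7}$$

Equating the two expressions and solving in terms of $\dot{E}$ we obtain again:

$$\dot{E} = \frac{c^2}{\gamma^2}\frac{dm}{d\tau} + \mathbf{f} \cdot \mathbf{u}. \tag{11.8}$$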

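Verbally, the above equation can be stated as follows:

$$\left[\begin{array}{c}\text{Change of the}\\ \text{total energy of } P\\ \text{in } \Sigma\end{array}\right] = \frac{c^2}{\gamma^2}\left[\begin{array}{c}\text{Change of}\\ \text{the internal energy}\\ \text{(mass) of } P\\ \text{in its proper frame}\end{array}\right] + \left[\begin{array}{c}\text{Rate of production}\\ \text{of work in } \Sigma \text{ by}\\ \text{the external forces}\\ \text{acting on } P \text{ in } \Sigma\end{array}\right]. \tag{11.9}$$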

Equation (11.9) is the conservation of energy in Special Relativity. The new
element is the relation of the (proper) mass of $P$ with the work produced by the
external force $\mathbf{f}$ acting on $P$ in $\Sigma$.

From equation (11.2) it follows that it is possible to classify the four-forces in a
covariant manner using the vanishing or not of the invariant $F^iu_i = -c^2\frac{dm}{d\tau}$. The
four-forces $F^i$, for which $\frac{dm}{d\tau} = 0$, we call pure or inertial four-forces; they are the
ones that create motion in 3-d space and correspond to the Newtonian forces. The
second type of four-forces is defined by the requirement $a^i = 0$ and corresponds to
four-forces which do not produce motion in 3-d space. These four-forces we call
thermal. The general expression of a thermal four-force is:

$$F^i = \frac{dm}{d\tau}u^i. \tag{11.10}$$

To understand this type of force consider a ReMaP $P$, whose four-velocity is $u^i$ and
which in its proper frame is heated (e.g. by means of the flame of a candle) or is losing
mass, as is the case e.g. with a rocket. Then $\frac{dm}{d\tau} \neq 0$ and on $P$ acts a thermal force
$F^i$. The 3-momentum of $P$ varies as follows:

$$d\mathbf{p} = dm\,\mathbf{u}$$




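which implies that on $P$ acts the 3-force:

$$\mathbf{f} = \frac{dm}{d\tau}\mathbf{u}. \tag{11.11}$$

The change of the energy $E$ of $P$ in $\Sigma$ is given by equation (11.8). For a pure
force we compute:

$$\dot{E} = \mathbf{f} \cdot \mathbf{u} \tag{11.12}$$

that is, we recover the corresponding expression of Newtonian Physics.
For a thermal force:

$$\dot{E} = \frac{c^2}{\gamma^2}\frac{dm}{d\tau} + \mathbf{f} \cdot \mathbf{u} = \frac{c^2}{\gamma^2}\frac{dm}{d\tau} + \frac{dm}{d\tau}u^2 = \frac{dm}{d\tau}c^2$$

that is, the rate of change of energy is independent of $\mathbf{u}$. In this case, the energy:

$$E(t) = \int \frac{dm}{d\tau}c^2\,dt + \text{constant}. \tag{11.13}$$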
T 


Let us consider an example with a thermal four-force. 


Example 11.2.1 A black (i.e. absolutely absorbing) surface is resting in the laboratory.
At what rate must photons of wavelength $\lambda$ fall normally on the surface in order
to exert a 3-force $F$ N normal to the surface?

Numerical application: $\lambda = 5 \times 10^{-7}$ m, $F = 10^{-5}$ N, $h = 6.63 \times 10^{-34}$ J s,
$S = 0.01$ m².

1st Solution

Let $J$ be the number of photons that fall on the surface $S$ per second and square
meter. Because the surface is black, all photons are absorbed. This implies that per
second and m² the photons transfer to the surface the momentum:

$$J\frac{h}{\lambda}.$$

The force $F$ due to the change of the 3-momentum is:

$$F = J\frac{h}{\lambda}S \;\Rightarrow\; J = \frac{F\lambda}{hS}.$$

Numerical application:

$$J = \frac{10^{-5}\,\text{N} \times 5 \times 10^{-7}\,\text{m}}{6.63 \times 10^{-34}\,\text{J s} \times 0.01\,\text{m}^2} = 7.54 \times 10^{23}\ \text{photons per second and m}^2.$$
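The numerical application can be reproduced in a few lines. This is a sketch under stated assumptions: the function name `photon_flux` is hypothetical, and the input data are taken as $F = 10^{-5}$ N, $\lambda = 5 \times 10^{-7}$ m, $h = 6.63 \times 10^{-34}$ J s and $S = 0.01$ m².

```python
def photon_flux(force, wavelength, area, h=6.63e-34):
    """Photons per second and per square meter needed to exert `force` (N)
    on a perfectly absorbing surface of `area` (m^2): J = F*lambda/(h*S)."""
    return force * wavelength / (h * area)

# Assumed input data of the numerical application (SI units).
J = photon_flux(force=1.0e-5, wavelength=5.0e-7, area=0.01)
print(f"J = {J:.3e} photons per second and m^2")
```

Each absorbed photon delivers the momentum $h/\lambda$, so the result scales linearly with the force and the wavelength and inversely with the area.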


2nd Solution [with the use of equation (11.13)]. 




Per unit of time (s) and surface (m²), $J$ photons of energy
$E = h\nu = \frac{hc}{\lambda}$ fall on the surface. Therefore, the increase of the mass of the surface per s and m² is:

$$\frac{dm}{d\tau} = J\frac{E}{c^2} = \frac{Jh}{c\lambda}.$$

The thermal force, which is produced from this increase of proper mass, equals:


where $\hat{\mathbf{z}}$ is the unit normal to the surface and parallel to the direction of the falling
photons.


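In the following we shall consider pure forces only, because these are the ones
that interest this book. This implies that in the sequel the following relations are
assumed to hold:

$$F^i = ma^i \tag{11.14}$$

$$F^i = \begin{pmatrix} \gamma\,\mathbf{f} \cdot \mathbf{u}/c \\ \gamma\mathbf{f} \end{pmatrix}_\Sigma \tag{11.15}$$

$$\dot{E} = \mathbf{f} \cdot \mathbf{u} = m\dot{\gamma}c^2. \tag{11.16}$$

If we replace in (11.14) $F^i$ and $a^i$ in terms of their components in an arbitrary LCF
$\Sigma$, we find equation (11.16) and the equation:

$$\mathbf{f} = m\dot{\gamma}\mathbf{u} + m\gamma\mathbf{a} \tag{11.17}$$

which defines the 3-force $\mathbf{f}$ in terms of the 3-acceleration $\mathbf{a}$. This expression can
be simplified significantly. Indeed, from equation (7.4) we have $\dot{\gamma} = \frac{1}{c^2}\gamma^3(\mathbf{u} \cdot \mathbf{a})$,
therefore replacing in (11.17) we find:

$$\mathbf{f} = m\gamma\left(\frac{1}{c^2}\gamma^2(\mathbf{u} \cdot \mathbf{a})\mathbf{u} + \mathbf{a}\right) = m\gamma\left(\frac{1}{c^2}\gamma^2u^2\mathbf{a}_\parallel + \mathbf{a}_\parallel + \mathbf{a}_\perp\right) = m\gamma\left[(\gamma^2\beta^2 + 1)\mathbf{a}_\parallel + \mathbf{a}_\perp\right] = m\gamma\left(\gamma^2\mathbf{a}_\parallel + \mathbf{a}_\perp\right). \tag{11.18}$$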

Exercise 11.2.1

a. Show that equation (11.18) is compatible with equation (11.17).
[Hint: Make use of (11.8)].
b. Show that equation (11.17) can be written as $\mathbf{f} = \frac{d\mathbf{p}}{dt}$, where $\mathbf{p} = m\gamma\mathbf{u}$
($m$ = constant) and $t$ is time in $\Sigma$. This result is used in the solution of problems,
when the force $\mathbf{f}$ is given.




From equation (11.17) we note that, although the expression of the four-force
$F^i = ma^i$ has the form of Newton's Second Law, the corresponding 3-force is
related to 3-acceleration in a different manner due to the presence of the extra term
$m\dot{\gamma}\mathbf{u}$, which is an "apparent" force related to the 3-velocity of $P$. This term never
vanishes except if $\dot{\gamma} = 0$, in which case (prove this result) $a^i = 0$, therefore
$\mathbf{f} = 0$. Furthermore, the three 3-vectors $\mathbf{f}$, $\mathbf{u}$, $\mathbf{a}$ are coplanar, but the direction of
$\mathbf{f}$ is different from that of $\mathbf{a}$.

This fact has led people to consider for the relativistic inertial 3-forces $\mathbf{f}$ two
"masses" with reference to the direction of the 3-velocity: the parallel mass ($m_\parallel$) and
the perpendicular mass ($m_\perp$). The calculation of these masses is as follows.

Projecting (11.18) along $\mathbf{u}$ we find:

$$f_\parallel = m\gamma^3a_\parallel \tag{11.19}$$

therefore:

$$m_\parallel = m\gamma^3. \tag{11.20}$$

Working similarly we find for $m_\perp$:

$$m_\perp = m\gamma. \tag{11.21}$$

A useful observation concerning one-dimensional problems is the following.
In these cases the 3-force is collinear to the 3-velocity, therefore from (11.19) we
have for any type of force the condition:

$$\int_1^2 f(x)\,dx = mc^2\left(\gamma_2 - \gamma_1\right) \tag{11.22}$$

that is, for a one-dimensional motion under the action of any type of (smooth) force
$f(x)$ the speed is given by the first integral (11.22).


Exercise 11.2.2 (Conservation of mechanical energy) Consider two events 1, 2
along the worldline $c_P$ of a ReMaP $P$. By making use of (11.16) show that for a
pure force in an arbitrary LCF $\Sigma$:

$$E_2 - E_1 = \int_{t_1}^{t_2} \mathbf{f} \cdot \mathbf{v}\,dt \tag{11.23}$$

where $\mathbf{v}$ is the 3-velocity of $P$ in $\Sigma$. Making use of the relation $E = mc^2 + T$, where
$T$ is the kinetic energy of $P$ in $\Sigma$, show that the change of kinetic energy in $\Sigma$ is
given by the expression:

$$T_2 - T_1 = \int_{t_1}^{t_2} \mathbf{f} \cdot \mathbf{v}\,dt. \tag{11.24}$$

Conclude that if $m$ = constant, then in any LCF $\Sigma$ the following law of conservation
of energy holds:

$$\left[\begin{array}{c}\text{Change of the}\\ \text{kinetic energy of } P \text{ in } \Sigma\\ \text{between the events}\\ \text{1, 2 along its worldline}\end{array}\right] = \left[\begin{array}{c}\text{Work done by}\\ \text{the external inertial}\\ \text{3-forces on } P \text{ in } \Sigma \text{ between}\\ \text{the events 1, 2}\end{array}\right] \tag{11.25}$$

Prove that in the case of one-dimensional motion relation (11.22) follows as a
special case of (11.24).

² Check this result with (11.18).


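Exercise 11.2.3 A ReMaP $P$ is moving in the LCF $\Sigma$ under the action of the 3-force
$\mathbf{f}$. Let $\Sigma'$ be another LCF, whose axes are parallel to those of $\Sigma$ and whose velocity is
$\mathbf{u}$. Prove that the 3-force $\mathbf{f}'$ and the rate of change of energy of $P$ in $\Sigma'$ are given by
the following relations:

$$\mathbf{f}' = \frac{1}{\gamma Q}\left[\mathbf{f} + \left((\gamma - 1)\frac{\mathbf{u} \cdot \mathbf{f}}{u^2} - \frac{\gamma}{c^2}\frac{dE}{dt}\right)\mathbf{u}\right] \tag{11.26}$$

$$\frac{dE'}{dt'} = \frac{1}{Q}\left(\frac{dE}{dt} - \mathbf{u} \cdot \mathbf{f}\right) \tag{11.27}$$

where $Q = \left(1 - \frac{\mathbf{u} \cdot \mathbf{v}}{c^2}\right)$, $\gamma$ is the $\gamma$-factor of $\mathbf{u}$ and $\mathbf{v}$ is the velocity of $P$ in $\Sigma$.

In the special case where $\Sigma$, $\Sigma'$ are related by a boost along the direction of the
common $x$-axis the above relations read:

$$f'_x = \frac{1}{Q}\left(f_x - \frac{u}{c^2}\frac{dE}{dt}\right) \tag{11.28}$$

$$f'_y = \frac{f_y}{\gamma Q} \tag{11.29}$$

$$f'_z = \frac{f_z}{\gamma Q} \tag{11.30}$$

[Hint: Apply the general Lorentz transformation (1.51) and (1.52) to the four-force.]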




A direct application of Exercise 11.2.3 is the following Example. 


Example 11.2.2 Let $\Sigma$ and $\Sigma'$ be two LCF with parallel axes and relative velocity
$\mathbf{u}$. A ReMaP $P$ is moving in $\Sigma$ under the action of the 3-force $\mathbf{f}$, which is always
normal to the direction of the 3-velocity of $P$ in $\Sigma$.

a. Show that the total energy of $P$ in $\Sigma$ is a constant of motion.

b. Show that the energy $E'$ of $P$ in $\Sigma'$ is not in general a constant of motion.

c. Apply the result of a. in order to show that the total energy of an electric charge,
which moves in a LCF $\Sigma$ under the action of a magnetic field only, is a constant
of motion.


Solution

a. Let $\mathbf{v}$ be the velocity of $P$ in $\Sigma$. From (11.16) we have in $\Sigma$:

$$\dot{E} = \mathbf{f} \cdot \mathbf{v} = 0 \;\Rightarrow\; E = \text{constant in } \Sigma.$$

b. Equation (11.27) gives:

$$\frac{dE'}{dt'} = -\frac{1}{Q}\left(\mathbf{u} \cdot \mathbf{f}\right) \neq 0 \quad (\text{except if } \mathbf{u} \cdot \mathbf{f} = 0).$$

c. When an electric charge $q$ moves in a LCF $\Sigma$, in which there is a magnetic field
$\mathbf{B}$ only, the force on the charge is the Lorentz force $\mathbf{f} = kq\,\mathbf{v} \times \mathbf{B}$ ($k$ is a constant
depending on the units). This force is always normal to the velocity of the charge,
therefore according to a. the energy of the charge is a constant of motion in $\Sigma$. In
$\Sigma'$ the electromagnetic field will have in general an electric field³ too, hence in $\Sigma'$
the Lorentz force on the charge is $\mathbf{f}' = q(\mathbf{E}' + \mathbf{v}' \times \mathbf{B}')$, which is not normal to
the velocity of the charge. Therefore (in general), the total energy of the charge
is not a constant of motion in $\Sigma'$. (Does this result contradict b?)


In the following examples we show how one studies the motion of a ReMaP, 
which moves under the action of a given 3- force. 


Example 11.2.3 A particle $P$ of mass $m$ is moving along the $x$-axis of the LCF $\Sigma$
under the action of the 3-force:

$$f = \frac{2mc^2a}{(a - x)^2}$$

where $a$ is a positive constant. Assuming that the motion starts from rest at the origin
of $\Sigma$, prove that $P$ is at the position $x$ ($x < a$) along the $x$-axis of $\Sigma$ after a time
interval:

$$t = \frac{1}{3c}\sqrt{\frac{x}{a}}\,(x + 3a).$$

[Hint: It is given that:

$$\int \frac{x\,dx}{\sqrt{ax + b}} = \frac{2(ax - 2b)}{3a^2}\sqrt{ax + b} + C.]$$

Solution
The motion of $P$ is one-dimensional, therefore (11.22) applies,⁴ which gives in
$\Sigma$, taking into consideration the initial condition:

$$\int_0^x f(x)\,dx = mc^2(\gamma - 1). \tag{11.31}$$

³ This will be shown in Chap. 13.


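Replacing $f(x)$ in the lhs we compute:

$$\int_0^x \frac{2mc^2a}{(a - x)^2}\,dx = 2mc^2\frac{x}{a - x}.$$

Replacing in (11.31) we find

$$\gamma = \frac{a + x}{a - x}$$

from which follows

$$\frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} = \frac{a + x}{a - x} \;\Rightarrow\; v = c\sqrt{1 - \left(\frac{a - x}{a + x}\right)^2} = \frac{2c\sqrt{ax}}{a + x}.$$

Concerning the calculation of $x(t)$ we have:

$$\frac{dx}{dt} = \frac{2c\sqrt{ax}}{a + x} \;\Rightarrow\; t = \frac{1}{2c}\int_0^x \frac{a + x}{\sqrt{ax}}\,dx \;\Rightarrow\; t = \frac{1}{3c}\sqrt{\frac{x}{a}}\,(x + 3a).$$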
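The closed form for the travel time can be compared with a direct numerical integration of $dt = \frac{a + x}{2c\sqrt{ax}}\,dx$. The sketch below is illustrative: the function names and the units $c = 1$ are assumptions, and the substitution $x = s^2$ is used here only to remove the integrable singularity at $x = 0$.

```python
import math

def travel_time_closed(x, a, c=1.0):
    """Closed-form result t = sqrt(x/a) (x + 3a) / (3c)."""
    return math.sqrt(x / a) * (x + 3.0 * a) / (3.0 * c)

def travel_time_numeric(x, a, c=1.0, n=2000):
    """Midpoint-rule integration of dt = (a + x) dx / (2 c sqrt(a x)).
    With x = s^2 the integrand becomes (a + s^2) / (c sqrt(a)) ds."""
    s_max = math.sqrt(x)
    h = s_max / n
    total = sum(a + ((i + 0.5) * h) ** 2 for i in range(n)) * h
    return total / (c * math.sqrt(a))

print(travel_time_closed(0.5, 1.0), travel_time_numeric(0.5, 1.0))
```

The two values agree to many digits, since the midpoint rule is very accurate for the quadratic integrand obtained after the substitution.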


Example 11.2.4 A particle of mass $m$ is moving in the $xy$ plane of the LCF $\Sigma$ under
the action of the constant force $\mathbf{F} = f\mathbf{j}$. If at $t = 0$ the particle starts to move from
the origin with 3-momentum $\mathbf{p}(0) = p_0\mathbf{i}$, show that the equation of motion of the
particle in $\Sigma$ is:

$$y = \frac{E_0}{f}\left(\cosh\frac{fx}{p_0c} - 1\right)$$

where $E_0 = \sqrt{p_0^2c^2 + m^2c^4}$.

Compute the Newtonian limit. How is this result related to the projectile motion
of Newtonian Physics in a constant gravitational field?

⁴ The same result we find if we consider the conservation of energy given by relation (11.24).
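Solution

The equation of motion of the particle in $\Sigma$ is $\mathbf{F} = \frac{d\mathbf{p}}{dt}$, from which follows:

$$\frac{dp_x}{dt} = 0, \qquad \frac{dp_y}{dt} = f, \qquad \frac{dp_z}{dt} = 0.$$

The first and the last equations give:

$$p_x(t) = p_x(0) = p_0$$
$$p_z(t) = p_z(0) = 0.$$

From the second equation we have:

$$p_y = ft + p_y(0) = ft.$$

Hence, the 3-momentum in $\Sigma$ is:

$$\mathbf{p} = (p_0, ft, 0).$$

The 3-velocity of the particle in $\Sigma$ is:

$$\mathbf{v} = \frac{\mathbf{p}}{m\gamma} = \frac{c^2}{E}\mathbf{p} = \frac{c\,\mathbf{p}}{\sqrt{m^2c^2 + p^2}} = \frac{c}{\sqrt{m^2c^2 + p_0^2 + f^2t^2}}\,(p_0, ft, 0) = \frac{c^2}{\sqrt{E_0^2 + f^2t^2c^2}}\,(p_0, ft, 0)$$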



where Eo is the total energy of the particle at t = 0. From this relation follows: 


dr oe 
= (po, ft, 0) > 


dt eG + f2t2c? 


t 
f= “CSO = wv e=h) = / ae EG, ROVE 
0 Eo + f2t?c? 


Using the initial condition r(0) = 0 we find: 


t 
x(t) = por dt = sinh7! 2— 
0 [E24 721202 ii Eo 


f tc? E a 
yn = f ea ey ee 1+(2) 
0 J E24 £2122 Ed Eo 


z(t) = 0. 


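The orbit of the particle is in the plane xy. Eliminating t we find the equation of the orbit:

$$y = \frac{E_0}{f}\left(-1 + \cosh\frac{fx}{p_0c}\right).$$

In order to find the Newtonian limit, we expand cosh around the point x = 0 and find:

$$y = \frac{E_0}{f}\left[\frac{1}{2}\frac{f^2x^2}{p_0^2c^2} + O(x^4)\right] = \frac{1}{2}\frac{E_0f}{p_0^2c^2}\,x^2 + O(x^4).$$

Ignoring $x^4$ terms, we obtain the result:

$$y \approx \frac{E_0f}{2p_0^2c^2}\,x^2 = \frac{m\gamma_0c^2f}{2m^2\gamma_0^2u_0^2c^2}\,x^2 = \frac{f}{2m\gamma_0u_0^2}\,x^2$$

which coincides with the orbit of a Newtonian projectile in a constant force field $\mathbf{f} = f\mathbf{j}$ with initial velocity $u_0 \ll c$ (so that $\gamma_0 \approx 1$).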
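The elimination of t and the Newtonian limit can be checked numerically. The sketch below is illustrative only: the function names, the sample values and the units c = 1 are assumptions, not part of the original text. It evaluates the closed-form x(t), y(t), compares them with the orbit equation, and compares the orbit with the parabola obtained by keeping only the x² term.

```python
import math

def rest_energy_plus_kinetic(m, p0, c=1.0):
    """E0 = sqrt(p0^2 c^2 + m^2 c^4), the total energy at t = 0."""
    return math.sqrt(p0**2 * c**2 + m**2 * c**4)

def trajectory(t, m, p0, f, c=1.0):
    """Closed-form x(t), y(t) for a constant force f along y, r(0) = 0."""
    e0 = rest_energy_plus_kinetic(m, p0, c)
    x = (p0 * c / f) * math.asinh(f * c * t / e0)
    y = (math.sqrt(e0**2 + (f * c * t)**2) - e0) / f
    return x, y

def orbit(x, m, p0, f, c=1.0):
    """Orbit y(x) = (E0 / f) * (cosh(f x / (p0 c)) - 1)."""
    e0 = rest_energy_plus_kinetic(m, p0, c)
    return (e0 / f) * (math.cosh(f * x / (p0 * c)) - 1.0)

def newtonian_orbit(x, m, p0, f, c=1.0):
    """Leading term of the expansion: y = E0 f x^2 / (2 p0^2 c^2)."""
    e0 = rest_energy_plus_kinetic(m, p0, c)
    return e0 * f * x**2 / (2.0 * p0**2 * c**2)

if __name__ == "__main__":
    x, y = trajectory(2.0, 1.0, 0.5, 0.3)
    print(x, y, orbit(x, 1.0, 0.5, 0.3))
```

For a weak force and a slow initial velocity the hyperbolic-cosine orbit and the parabola become indistinguishable, which is the content of the Newtonian limit derived above.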


Example 11.2.5 (Hyperbolic motion) A particle of mass m is moving along the x-axis of the LCF Σ under the action of the constant 3-force $\mathbf{f} = f\mathbf{i}$ (f = constant in Σ).

1. Assume that the particle starts its motion from rest at the origin of Σ and show that its position x(t) is given by the expression:

$$x(t) = \frac{c}{k}\left(\sqrt{1 + k^2t^2} - 1\right), \qquad k = \frac{f}{mc}.$$

Display the quantities $\frac{1}{c}u(t)$ and $\frac{k}{c}x(t)$ in terms of kt, where u(t) is the velocity of the particle.
2. If τ is the proper time of the particle show that:

$$x(\tau) = \frac{c}{k}(\cosh k\tau - 1), \qquad t = \frac{1}{k}\sinh k\tau.$$

Making use of the above results, show that the length of the position four-vector measured from the point $x = -c/k$, that is, the quantity $(x + c/k)^2 - c^2t^2$, is a constant of motion (hyperbolic motion, see Sect. 7.4).


Solution 


1. The equation of motion of the particle in Σ is:

$$\frac{d}{dt}(m\gamma v) = f.$$

Integrating and taking into consideration the initial condition v(0) = 0 we find:

$$v = \frac{\frac{f}{m}t}{\sqrt{1 + \left(\frac{f}{mc}t\right)^2}}.$$

We introduce the constant k (with dimensions $[T^{-1}]$) $k = \frac{f}{mc}$ and write:

$$v(t) = \frac{ckt}{\sqrt{1 + k^2t^2}}.$$

Integrating again and using the initial condition x(0) = 0 follows:

$$x(t) = \frac{c}{k}\left(\sqrt{1 + k^2t^2} - 1\right). \tag{11.32}$$

The graph of the quantities $\frac{1}{c}u(kt)$, $\frac{k}{c}x(kt)$ in terms of kt is shown in Fig. 11.1.

Fig. 11.1 Position and speed in one dimensional hyperbolic motion

2. Let τ be the proper time of the particle. Then:

$$d\tau = \frac{1}{\gamma}dt = \frac{dt}{\sqrt{1 + k^2t^2}} \;\Rightarrow\; t = \frac{1}{k}\sinh k\tau.$$

Replacing t in x(t) we find:

$$x(\tau) = \frac{c}{k}(\cosh k\tau - 1). \tag{11.33}$$


The last part follows by direct substitution: $(x + c/k)^2 - c^2t^2 = c^2/k^2$, which is constant along the worldline.
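The relations of hyperbolic motion lend themselves to a quick numerical check. The following sketch is illustrative only; the function names and the units c = 1 are assumptions. It evaluates the motion both in coordinate time and in proper time and confirms that the point (ct, x + c/k) stays on a hyperbola.

```python
import math

def lab_position(t, k, c=1.0):
    """x(t) = (c/k) * (sqrt(1 + k^2 t^2) - 1), eq. (11.32)."""
    return (c / k) * (math.sqrt(1.0 + (k * t)**2) - 1.0)

def lab_velocity(t, k, c=1.0):
    """v(t) = c k t / sqrt(1 + k^2 t^2); always below c."""
    return c * k * t / math.sqrt(1.0 + (k * t)**2)

def from_proper_time(tau, k, c=1.0):
    """Return (t, x) as functions of the proper time tau, eq. (11.33)."""
    t = math.sinh(k * tau) / k
    x = (c / k) * (math.cosh(k * tau) - 1.0)
    return t, x

def hyperbola_invariant(t, x, k, c=1.0):
    """(x + c/k)^2 - c^2 t^2, which should equal c^2 / k^2."""
    return (x + c / k)**2 - (c * t)**2

if __name__ == "__main__":
    t, x = from_proper_time(1.3, 0.7)
    print(t, x, lab_position(t, 0.7), hyperbola_invariant(t, x, 0.7))
```

The invariant equals c²/k² for every value of the proper time, while the speed approaches but never reaches c.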


In the next example we show that the 3-force is not necessarily collinear with the 3-acceleration, a characteristic which drastically differentiates relativistic dynamics from Newtonian dynamics. In general, accelerated motions (for large and persistent accelerations) give rise to paradoxes, which are not easy to understand or explain and must be considered with great care.


Example 11.2.6 A particle of mass m moves in the xy plane of a LCF Σ under the action of the 3-force $\mathbf{f} = f_x\mathbf{i} + f_y\mathbf{j}$. If $\mathbf{a} = a_x\mathbf{i} + a_y\mathbf{j}$ is the 3-acceleration of the particle in Σ, show that:

$$f_x = m\,\frac{a_x(1 - \beta_y^2) + a_y\beta_x\beta_y}{(1 - \beta_x^2 - \beta_y^2)^{3/2}} \tag{11.34}$$

$$f_y = m\,\frac{a_y(1 - \beta_x^2) + a_x\beta_x\beta_y}{(1 - \beta_x^2 - \beta_y^2)^{3/2}} \tag{11.35}$$

These relations show that the 3-force acting on the particle and the resulting 3-acceleration are not collinear (both in Σ!).

Using these relations prove that in order that $a_x = 0$, the 3-force must have a component along the x-axis. Compute this component and give a physical interpretation of this "paradox".

In order to study the relation between the components of the 3-force and the components of the 3-acceleration introduce the quantity $R = \frac{f_x}{f_y}$ and show that:

$$\frac{a_y}{a_x} = \frac{k_1 - Rk_2}{Rk_3 - k_2}, \tag{11.36}$$

where $k_1 = 1 - \frac{u_y^2}{c^2}$, $k_2 = \frac{u_xu_y}{c^2}$, $k_3 = 1 - \frac{u_x^2}{c^2}$. Consider the special case $\beta_x(t) = \beta_y(t)$ and show that:

$$k_1 = k_3 = 1 - k_2.$$

Display $\frac{a_y}{a_x}$ as a function of $k_2$ for the values $R \in \{1.0, 1.1, 2.0, 10.0, \infty\}$. Comment on the result.⁵

(Hint: $\dot\gamma = \gamma^3\,\boldsymbol\beta\cdot\dot{\boldsymbol\beta}$).
Solution 

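Because the motion is in the plane xy the component $u_z = 0$. We compute:

$$\begin{aligned}
f_x &= \frac{d}{dt}(m\gamma u_x) = m\frac{d\gamma}{dt}u_x + m\gamma a_x = \frac{m\gamma^3}{c^2}u_x(u_xa_x + u_ya_y) + m\gamma a_x \\
&= m\beta_x\gamma^3\left(\beta_xa_x + \beta_ya_y + \frac{a_x}{\gamma^2\beta_x}\right) \\
&= m\beta_x\gamma^3\left[\beta_ya_y + \beta_xa_x + \frac{1 - \beta_x^2 - \beta_y^2}{\beta_x}a_x\right] \;\Rightarrow
\end{aligned}$$

$$f_x = m\gamma^3\left[(1 - \beta_y^2)a_x + \beta_x\beta_ya_y\right].$$

Similarly we show the result for $f_y$.

We set $a_x = 0$ in the last relations and we get:

$$f_x = m\gamma^3a_y\beta_x\beta_y \quad\text{and}\quad f_y = m\gamma^3a_y(1 - \beta_x^2)$$

from which follows:

$$f_x = \frac{\beta_x\beta_y}{1 - \beta_x^2}\,f_y.$$

The component $f_x$ of the 3-force along the x-axis when the component $a_x$ of the 3-acceleration vanishes is explained as follows. Condition $a_x = 0$ implies $u_x$ = constant. However, the time derivative of the momentum $p_x = m\gamma(u_x, u_y)\,u_x$ does not vanish because $\dot\gamma \neq 0$, therefore the component $f_x \neq 0$.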


⁵More information on this interesting approach can be found in the following references:

1. G. W. Ficken "A relativity paradox: The negative acceleration component" Am. J. Phys. (1976) 44, 1136-1137

2. P. F. Gonzalez Diaz "Some additional results on the directional relationship between forces and acceleration in Special Relativity" Il Nuovo Cimento (1979), 51B, 104-116


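We define $R = \frac{f_x}{f_y}$ and replacing in the expressions which give $f_x$, $f_y$ we find:

$$R = \frac{f_x}{f_y} = \frac{a_xk_1 + a_yk_2}{a_xk_2 + a_yk_3} = \frac{k_1 + \frac{a_y}{a_x}k_2}{k_2 + \frac{a_y}{a_x}k_3} \;\Rightarrow\; \frac{a_y}{a_x} = \frac{k_1 - Rk_2}{Rk_3 - k_2}$$

where $k_1 = 1 - \beta_y^2$, $k_2 = \beta_x\beta_y$, $k_3 = 1 - \beta_x^2$.

In case $\beta_x(t) = \beta_y(t)$ it follows easily that $k_1 = k_3 = 1 - k_2$, $\beta_x^2 = \frac{1}{2}\beta^2$. Furthermore, $\gamma = \frac{1}{\sqrt{1 - 2k_2}}$, hence:

$$\frac{a_y}{a_x} = \frac{1 - (1 + R)k_2}{R - (1 + R)k_2}. \tag{11.37}$$

The graph of $\frac{a_y}{a_x}(k_2)$ is shown in Fig. 11.2.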


Since $k_2 = \frac{1}{2}\beta^2$ the deviation from the Newtonian result appears when the speed u has large values. For small speeds $k_2 \to 0$ and $\frac{a_y}{a_x} \to \frac{1}{R} = \frac{f_y}{f_x}$, which is the expected Newtonian expression. It is to be noted that when $f_y = 0$ and $f_x \neq 0$, that is, when there is force only along the x-axis, then $R = \infty$ and from the graph it follows that $\frac{a_y}{a_x} < 0$, hence the quantity $a_y$ measures retardation. In this case (11.35) gives:

$$a_y = -\frac{k_2}{k_3}\,a_x = \left(1 - \frac{1}{1 - k_2}\right)a_x < 0$$

$$f_x = m\gamma^3\,\frac{k_1k_3 - k_2^2}{k_3}\,a_x = m\gamma^3\left(2 - \frac{1}{1 - k_2}\right)a_x > 0.$$

Fig. 11.2 The quotient $\frac{a_y}{a_x}$ when $u_x(t) = u_y(t)$

When $f_x = 0$ then (11.34) gives $a_x = -\frac{k_2}{k_1}a_y$, which means that $a_x < 0$ and $|a_x| < |a_y|$ because $0 < k_2 < 1/2$, as can be seen from Fig. 11.2. From this relation follows:

$$f_y = m\gamma^3(1 - k_2)\left(1 - \frac{k_2^2}{(1 - k_2)^2}\right)a_y.$$

We note that as $k_2 \to 1/2$ then $f_y \to \infty$ because $\gamma \to \infty$.

If the component $f_y$ has a small value, e.g. $f_y = f_x/10$ (that is, R = 10), then from the graph of Fig. 11.2 it follows that for this value of R the quotient $\frac{a_y}{a_x} > 0$ at small $k_2$. The vanishing of $\frac{a_y}{a_x}$ occurs when $k_2 \approx 0.1$, which implies that the effect of the component $f_y$ of the 3-force "finishes" at the value $k_2 \approx 0.1$. Finally, when R = 1 from (11.37) follows $\frac{a_y}{a_x} = \frac{f_y}{f_x} = 1$, that is, the quotient $\frac{a_y}{a_x}$ is independent of $k_2$. This is a limiting case.

Exercise 11.2.4 


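a. Using the identity of the (Euclidean) 3-d vector calculus:

$$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = (\mathbf{A}\cdot\mathbf{C})\mathbf{B} - (\mathbf{A}\cdot\mathbf{B})\mathbf{C}$$

show that:

$$\mathbf{A} = \frac{1}{B^2}(\mathbf{A}\cdot\mathbf{B})\mathbf{B} - \frac{\mathbf{B}\times(\mathbf{B}\times\mathbf{A})}{B^2}. \tag{11.38}$$

This relation decomposes the vector A into components parallel and normal to the vector B.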
b. Take B to be the unit vector @, along the direction of the velocity and show the 
relation: 


1 ux(v x u) A 


Subsequently, show that when f£, 4 0 the f can be written as a Lorentz force as 
follows®: 


. 1 
f= “a latov x h| (11.40) 
where h = ux, is the “magnetic field” and V = u+ eRe, is the “corrected” 


velocity in the LCF © and R = Q [fi - ara . tl 
Hint: Use (11.26). 
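The decomposition (11.38) of part (a) can be verified on concrete vectors. The following is a minimal sketch with hypothetical helper names (`dot`, `cross`, `decompose`); it is not part of the exercise itself.

```python
def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Euclidean cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def decompose(a, b):
    """Split a into parts parallel and normal to b, following (11.38)."""
    b2 = dot(b, b)
    parallel = tuple(dot(a, b) / b2 * x for x in b)
    normal = tuple(-x / b2 for x in cross(b, cross(b, a)))
    return parallel, normal

if __name__ == "__main__":
    par, nor = decompose((1.0, 2.0, 3.0), (0.5, -1.0, 2.0))
    print(par, nor)
```

The two parts add up to the original vector and the normal part has vanishing inner product with B.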


⁶A special case of this result can be found in D. Bedford and P. Krumm (1985), Am. J. Phys. 53, 889.


11.3 Inertial Four-Force and Four-Potential 


In order to develop the dynamics of Special Relativity along the analytic formalism of Lagrange and Hamilton, we have to introduce potential functions for four-forces. It is obvious that we shall consider only inertial four-forces because analytic dynamics concerns inertial motion and not e.g. thermal forces.

The first type of potential functions we look for are the (Lorentz) invariants.⁷ Consider a ReMaP with four-velocity $u^i$ which is moving under the action of the inertial four-force $F^i$. Suppose $\phi(l, \mathbf{r})$ (l = ct) is an invariant such that:

$$F_i = -q\phi_{,i} \tag{11.41}$$

where q is a constant. The spacelike character of $F^i$ implies the constraint:

$$F_iu^i = -q\phi_{,i}u^i = 0$$

for all four-velocities $u^i$. The geometric meaning of this constraint is that the "equipotential surfaces" $\phi$ = constant are (Lorentz) orthogonal to the four-velocity $u^i$, hence they are spacelike hypersurfaces. Since the four-velocity is arbitrary, this implies $\phi_{,i} = 0$ (why?). Clearly this is impossible, therefore it is not possible to consider an invariant (i.e. scalar) four-potential.

Our next choice is a vector four-potential, $\Phi_i(l, \mathbf{r})$ say. In this case, we assume that the four-potential is related to the four-force $F^i$ as follows:

$$F_i = q(\Phi_{i,j} - \Phi_{j,i})u^j. \tag{11.42}$$

We note that the constraint $F_iu^i = 0$ is satisfied identically for all $u^i$, therefore it is possible to consider such four-potentials. In the more general case, that is, when there does not exist a four-potential, it is still possible to write the (inertial) four-force in the form $F_i = \Omega_{ij}u^j$, where $\Omega_{ij}$ is an antisymmetric tensor. Such four-forces are due to various dynamical fields.
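That the constraint holds identically for any antisymmetric tensor can be illustrated numerically. The sketch below is an assumption-laden illustration: the helper names are hypothetical and the tensor components are random numbers. It builds an antisymmetric 4 × 4 array, contracts it with an arbitrary vector, and evaluates the resulting contraction.

```python
import random

def random_antisymmetric(rng):
    """Return a 4x4 nested list omega with omega[i][j] = -omega[j][i]."""
    omega = [[0.0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(i + 1, 4):
            omega[i][j] = rng.uniform(-1.0, 1.0)
            omega[j][i] = -omega[i][j]
    return omega

def four_force(omega, u):
    """F_i = Omega_ij u^j (sum over j)."""
    return [sum(omega[i][j] * u[j] for j in range(4)) for i in range(4)]

def contraction(force, u):
    """F_i u^i; no metric is needed since the indices are already paired."""
    return sum(f * x for f, x in zip(force, u))

if __name__ == "__main__":
    rng = random.Random(0)
    omega = random_antisymmetric(rng)
    u = [1.2, 0.3, -0.4, 0.5]
    print(contraction(four_force(omega, u), u))
```

The contraction vanishes (up to rounding) for every choice of the vector, because the symmetric product $u^iu^j$ is summed against an antisymmetric array.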

Now we come to the question: Suppose one is given a 3-force f which in the LCF Σ is conservative (that is, it can be expressed as the derivative of a scalar 3-potential, not Lorentz invariant in general). How will one compute a four-potential for the four-force defined by this 3-force in Σ and subsequently in all other LCF via the appropriate Lorentz transformation? Furthermore, is the transformed 3-force in the other LCF conservative and, if it is, does there exist a transformation law between the two 3-potentials?


⁷The potential is a relativistic physical quantity, therefore it must be expressed in terms of a Lorentz tensor.




In order to answer these questions we note that the Lorentz transformation of the 3-force does not necessarily preserve the conservative character of the force, that is, in general the transformed 3-force will not be conservative. However, if we manage to define for the 3-force f a four-potential in Σ then this will give a four-potential in any other LCF via the Lorentz transformation of the first. This is shown clearly by the following example.


Example 11.3.1 Show that a 3-force which is central in the LCF Σ is not (in general) central in another LCF Σ'. Because the central forces are conservative conclude that the property of a 3-force to be conservative is not (in general) Lorentz covariant.
Solution

Let $\mathbf{f}(\mathbf{r}) = f(r)\hat{\mathbf{r}}$ be a central force in the LCF Σ where $\mathbf{r} = r\hat{\mathbf{r}}$ is the position vector in Σ. As we know from Newtonian Mechanics, this force is conservative with potential function $U(r) = -\int\mathbf{f}\cdot d\mathbf{r} = -\int f(r)\,dr$. Let P be a ReMaP which in Σ has velocity v and position vector r, while it moves under the action of the 3-force f(r). Then the four-force on P in Σ is:

$$F^i = \gamma_v\left(\frac{1}{c}\,\mathbf{f}\cdot\mathbf{v},\ \mathbf{f}\right)_\Sigma.$$

Let Σ' be another LCF in which P has velocity v' and position vector r' while it is moving under the action of the same 3-force, which in Σ' we denote by f'. In order to compute f', we consider the Lorentz transformation of the four-vector $F^i$ from Σ to Σ' and have ($Q = 1 - \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}$, see Exercise 11.2.3):


f =— [ror+[o DSS -Sroe-w]al 
-= r+[ou- D> - *en]ur (ru - cr vyul. 


The first two terms in brackets equal r’ and the quantity ct — ™ 


dt is the proper time of P. Finally, we find: 


f= LO (v + am), (11.43) 


Uv 


= +dt where 
Yo 


Obviously the 3-force f’ is not central (due to the term LOU) in D’ hence, in 
general, f’ is not conservative. 


Exercise 11.3.1 In Example 11.3.1 calculate $\nabla'\times\mathbf{f}'$ and find the conditions for which the 3-force f' is conservative.


11.3.1 The Vector Four-Potential 


Consider an LCF Σ in which there exists a conservative 3-force field f with potential⁸ $\phi(l, \mathbf{r})$ (l = ct) so that $\mathbf{f} = -q\nabla\phi$. We define the four-potential of the four-force $F_i$, which in Σ is defined by the 3-force f, to be the timelike four-vector $\Phi_i$ which

a. In Σ has components:

$$\Phi_i = (\phi/c, 0, 0, 0)_\Sigma \tag{11.44}$$

b. Satisfies the condition:

$$F_i = q(\Phi_{i,j} - \Phi_{j,i})v^j \tag{11.45}$$

where q is a constant.


Before we continue, we have to check that this definition of the four-potential is compatible with the decomposition of the four-force in Σ, that is, that the definition (11.45) leads to the expression⁹ $F^i = \gamma\left(\frac{1}{c}\mathbf{f}\cdot\mathbf{v},\ \mathbf{f}\right)_\Sigma$ provided $\Phi_i$ is given by (11.44).

In order to show this, we write $\mathbf{f} = -q\nabla\phi$ and assume that in Σ, $v^i = \gamma(c, \mathbf{v})_\Sigma$. We compute the components of the four-force defined by f by considering the definition (11.45). For the temporal component we have¹⁰:

$$F_0 = q(\Phi_{0,j} - \Phi_{j,0})v^j = q(\Phi_{0,\mu} - \Phi_{\mu,0})\gamma v^\mu = q\Phi_{0,\mu}\gamma v^\mu \tag{11.46}$$

$$= \frac{1}{c}\,q\gamma\,\nabla\phi\cdot\mathbf{v} = -\frac{1}{c}\,\gamma\,\mathbf{f}\cdot\mathbf{v}. \tag{11.47}$$

Similarly, for the spatial components we have:

$$F_\mu = q(\Phi_{\mu,j} - \Phi_{j,\mu})v^j = -q\Phi_{0,\mu}v^0 = -\frac{q}{c}\,\phi_{,\mu}\,\gamma c = \gamma f_\mu$$

which completes the proof.
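The compatibility just proved can be illustrated with a concrete static potential. The sketch below is illustrative only: the potential φ = 1/|r|, the unit constants and the function names are assumptions chosen for the example, not taken from the text. It evaluates the covariant components $F_i$ once from (11.45) with the four-potential (11.44) and once from the 3-force, and the two lists should coincide.

```python
import math

def grad_phi(r):
    """Gradient of the illustrative static potential phi = 1/|r| (an assumption)."""
    norm = math.sqrt(sum(x * x for x in r))
    return [-x / norm**3 for x in r]

def lorentz_factor(v, c=1.0):
    return 1.0 / math.sqrt(1.0 - sum(x * x for x in v) / c**2)

def four_force_from_potential(r, v, q=1.0, c=1.0):
    """Covariant components F_i = q (Phi_{i,j} - Phi_{j,i}) v^j, Phi_i = (phi/c, 0, 0, 0)."""
    g = grad_phi(r)
    d = [[0.0] * 4 for _ in range(4)]        # d[i][j] stands for Phi_{i,j}
    for mu in range(3):
        d[0][mu + 1] = g[mu] / c             # only Phi_0 depends on position
    gamma = lorentz_factor(v, c)
    v4 = [gamma * c] + [gamma * x for x in v]
    return [q * sum((d[i][j] - d[j][i]) * v4[j] for j in range(4)) for i in range(4)]

def four_force_from_3force(r, v, q=1.0, c=1.0):
    """Covariant components (-gamma f.v / c, gamma f) with f = -q grad(phi)."""
    f = [-q * g for g in grad_phi(r)]
    gamma = lorentz_factor(v, c)
    f_dot_v = sum(a * b for a, b in zip(f, v))
    return [-gamma * f_dot_v / c] + [gamma * a for a in f]

if __name__ == "__main__":
    r, v = (1.0, 2.0, 2.0), (0.1, 0.2, -0.3)
    print(four_force_from_potential(r, v))
    print(four_force_from_3force(r, v))
```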

Concerning the physical meaning of the four-potential, we note that it is a four-vector which is determined by the motion of P, therefore it must be expressible in terms of the invariants associated with the ReMaP P. The only such quantity (in the absence of dynamical fields) is the mass of P. Therefore, there must exist a relation between the mass m of P, which is moving under the action of the inertial four-force, and the four-potential of the four-force. Obviously, this relation must be sought in the conservation of energy.

⁸The scalar potential $\phi(\mathbf{r}, t)$ corresponds to the 3-force f, not the four-force $F^i$ defined by f! As we have shown, there does not exist an invariant four-potential.

⁹One could work in the opposite direction, that is, consider the decomposition of the four-force in Σ and derive the definition (11.45).

¹⁰Care! $F_0 = -F^0$.

We consider the condition $F^iv_i = 0$ which gives in Σ:

$$\frac{dE}{dt} = \mathbf{f}\cdot\mathbf{v} \;\Rightarrow\; dE = \mathbf{f}\cdot d\mathbf{r} = -q\nabla\phi\cdot d\mathbf{r} = -q\,d\phi$$

hence in Σ:

$$E + q\phi = \text{constant}.$$

But $E = T + mc^2$ where T is the kinetic energy of P in Σ. Hence:

$$T + q\phi = \text{constant} - mc^2.$$

The lhs is the same as the corresponding Newtonian expression whereas the rhs differs by the term $mc^2$, which indicates the different dynamics of the two theories. Indeed, the common appearance of this term with the quantities T and $q\phi$ indicates that the mass m of P can change by the kinetic energy and/or the potential energy of the dynamical field, which modulates the motion. Therefore, if $E_1 = M_1c^2$ and $E_2 = M_2c^2$ ($M_i = \gamma_im$ where i = 1, 2) are the energies of P at the events 1, 2 of its worldline and $\phi_1$, $\phi_2$ the corresponding values of the potential of the 3-force in Σ, then:

$$(M_2 - M_1)c^2 = q(\phi_1 - \phi_2)$$

that is, the energy of the field changes the "inertial" mass of P and conversely.


11.4 The Lagrangian Formalism for Inertial Four-Forces 


The Lagrangian and the Hamiltonian formulations of the dynamics of Special Relativity are as important as they are in Newtonian Physics. However, they present "peculiarities" and they are more obscure due to the hyperbolic geometry of spacetime and the possibility of null four-vectors. In the following, we shall discuss only the rudiments of these formalisms and advise the interested reader to consult books whose subject is relativistic field theory.

The main role of Lagrange equations is to produce the equations of motion of a ReMaP moving under the action of an inertial dynamical field. The Lagrange equations involve a scalar (not invariant!) function, the Lagrangian, and the equations of motion (for conservative time independent (autonomous) dynamical fields) have the well known form:

$$\frac{d}{d\theta}\frac{\partial L}{\partial\dot{x}^i} - \frac{\partial L}{\partial x^i} = 0 \tag{11.48}$$

where a dot over a symbol indicates derivation along the worldline of the ReMaP, which is parameterized with the parameter θ. The function L cannot be arbitrary. Indeed, the equations of motion (11.48) must be covariant wrt the Lorentz group (see Covariance Principle in Sect. 4.8), therefore the quantity $L\,d\theta$ must be a (Lorentz) invariant. This requirement restricts L depending on the type of the parameter θ. A significant difference from Newtonian Physics is that θ need not be the time but it can be any (increasing and smooth) parameter along the worldline of the ReMaP. From the variational calculus it is known that equations (11.48) follow from the stationary integral:

$$\delta\int L\,d\theta = 0. \tag{11.49}$$


In case θ is the time in a LCF Σ then the integral $\int L\,dt$ is the action integral and the Lagrangian function L is called the relativistic Lagrangian of the ReMaP in Σ.

We start our study by computing the Lagrangian of a free relativistic particle of mass m and four-velocity $u^i$. In Newtonian Physics the Lagrangian $L_{0,N}$ for a free Newtonian particle is the kinetic energy:

$$L_{0,N} = T = \frac{1}{2}mv^2$$

and the Principle of Least Action¹¹ is:

$$\delta\int L_{0,N}\,dt = 0$$

where t is time along the orbit of the moving particle in space. This leads us to consider¹² in Special Relativity as the Lagrangian of the free particle the Lagrangian $L_{0,M}$:

$$L_{0,M} = m(u^iu_i) = -mc^2 \tag{11.50}$$

¹¹It is not always Least. The correct term is Extremal Action. But this is what has prevailed in the literature and we shall follow it.

¹²As a matter of fact this choice is unique, because m and the proper time are the only invariants associated with a free particle.


and write for the (relativistic) Principle of Least Action:

$$\delta\int_1^2 L_{0,M}\,d\tau = 0 \tag{11.51}$$

where 1, 2 are events along the worldline of the particle and τ is the proper time of the particle. Equation (11.51) does not apply to photons, because photons do not have proper time. However the formalism can be extended to include photons, by considering a different parameter along the null worldline of a photon. In order for the function $L_{0,M}$ to be accepted it must lead to the equation of motion of a free particle:

$$\frac{dp^i}{d\tau} = 0. \tag{11.52}$$

As we assume the four-force to be inertial ($F^iu_i = 0$) the mass m of the particle is constant, hence equation (11.52) is written:

$$\frac{du^i}{d\tau} = \frac{d^2x^i}{d\tau^2} = 0 \tag{11.53}$$

and has solution:

$$x^i = u^i\tau + B^i. \tag{11.54}$$


This is consistent with the assumption that the worldline of a free particle is a 
timelike straight line. 

Let us examine the Lagrangian we considered in more detail. Replacing $L_{0,M}$ in (11.51) we find:

$$-mc^2\,\delta\int_1^2 d\tau = 0 \;\Rightarrow\; \delta\int_1^2 d\tau = 0. \tag{11.55}$$

Equation (11.55) is the equation of geodesics, which in Minkowski space are indeed the straight lines (11.54). By definition, the world lines of photons are null straight lines in Minkowski space, therefore the formalism we developed is ample and consistent with the assumptions we have made so far.

It remains to prove that the variation (11.51) leads to the equation of motion (11.52) when $L\,d\theta = L_{0,M}\,d\tau$ and θ is a smooth function of τ such that the derivative $\frac{d\theta}{d\tau}$ between two events 1, 2 of the worldline of the particle does not change sign. Without restriction of generality we assume $\frac{d\theta}{d\tau} > 0$.

The necessity of considering the parameter θ comes from the need to cover the case of photons, which do not have proper time, as well as the need to compute the equations of motion in an LCF other than the proper frame of the particle. For example, if we consider θ = t where t is time in Σ, then the equations which follow from the variation of the action integral are the equations of motion in Σ. With the introduction of the parameter θ the Lagrangian becomes:


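$$L_{0,M} = mu^iu_i = -mc\sqrt{-u^iu_i} = -mc\sqrt{-\eta_{ij}\frac{dx^i}{d\tau}\frac{dx^j}{d\tau}} = -mc\sqrt{-\eta_{ij}\frac{dx^i}{d\theta}\frac{dx^j}{d\theta}}\;\frac{d\theta}{d\tau}$$

hence:

$$L_{0,M}\,d\tau = -mc\sqrt{-\eta_{ij}\dot{x}^i\dot{x}^j}\;d\theta \tag{11.56}$$

where $\dot{x}^i = \frac{dx^i}{d\theta}$. The Principle of Least Action reads for the parameter θ:

$$\delta\int\Lambda\,d\theta = 0 \tag{11.57}$$

where:

$$\Lambda = -mc\sqrt{-\eta_{ij}\dot{x}^i\dot{x}^j}. \tag{11.58}$$

In the following, we show that indeed the Lagrangian Λ leads to the correct equations of motion.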

Consider an inertial four-force field $F^i$ which is described by the four-vector potential $\Phi_i$ according to the relation:

$$F_i = k(\Phi_{i,j} - \Phi_{j,i})\frac{dx^j}{d\tau} \tag{11.59}$$

where k is a constant. As we have said, the equation of motion of a ReMaP P of mass m and four-velocity $u^i$ in the four-force field $F^i$ is:

$$\frac{dp^i}{d\tau} = F^i \tag{11.60}$$

where $p^i = mu^i$ and τ is the proper time of P. This relation in terms of the parameter θ is written (with lowered indices) as follows:

$$\frac{dp_i}{d\theta}\frac{d\theta}{d\tau} = F_i = k(\Phi_{i,j} - \Phi_{j,i})\,\dot{x}^j\,\frac{d\theta}{d\tau} \;\Rightarrow$$

$$\frac{dp_i}{d\theta} = k(\Phi_{i,j} - \Phi_{j,i})\,\dot{x}^j. \tag{11.61}$$


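Equation (11.61) must follow from Lagrange equations for the Lagrangian Λ, that is from the equations:

$$\frac{d}{d\theta}\frac{\partial\Lambda}{\partial\dot{x}^i} - \frac{\partial\Lambda}{\partial x^i} = 0. \tag{11.62}$$

The Lagrangian Λ, besides the term $-mc\sqrt{-\eta_{ij}\dot{x}^i\dot{x}^j}$ which is the Lagrangian of the free particle, must have a second term which will take into account the 4-potential of the field $F^i$. In analogy with Newtonian Physics we define the relativistic Lagrangian as follows:

$$\Lambda = -mc\sqrt{-\eta_{ij}\dot{x}^i\dot{x}^j} - k\Phi_i\dot{x}^i \tag{11.63}$$

and compute the Lagrange equations (11.62). We set for convenience $A = \sqrt{-\eta_{ij}\dot{x}^i\dot{x}^j}$ and get:

$$\frac{\partial\Lambda}{\partial\dot{x}^i} = -mc\frac{\partial A}{\partial\dot{x}^i} - k\Phi_i.$$

But:

$$\frac{\partial A}{\partial\dot{x}^i} = -\frac{1}{A}\eta_{ij}\dot{x}^j$$

hence:

$$\frac{\partial\Lambda}{\partial\dot{x}^i} = \frac{mc}{A}\eta_{ij}\dot{x}^j - k\Phi_i.$$

The derivative:

$$\frac{d}{d\theta}\frac{\partial\Lambda}{\partial\dot{x}^i} = \frac{mc}{A}\eta_{ij}\ddot{x}^j - \frac{mc}{A^2}\dot{A}\,\eta_{ij}\dot{x}^j - k\dot\Phi_i = \frac{mc}{A^2}\eta_{ij}\left[A\ddot{x}^j - \dot{A}\dot{x}^j\right] - k\dot\Phi_i.$$

The term:

$$\dot\Phi_i = \frac{d\Phi_i}{d\theta} = \Phi_{i,j}\dot{x}^j$$

so that:

$$\frac{d}{d\theta}\frac{\partial\Lambda}{\partial\dot{x}^i} = \frac{mc}{A^2}\eta_{ij}\left[A\ddot{x}^j - \dot{A}\dot{x}^j\right] - k\Phi_{i,j}\dot{x}^j.$$

We also compute:

$$\frac{\partial\Lambda}{\partial x^i} = -k\Phi_{j,i}\dot{x}^j.$$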


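Replacing in (11.62) we find:

$$\frac{mc}{A^2}\eta_{ij}\left(A\ddot{x}^j - \dot{A}\dot{x}^j\right) - k\Phi_{i,j}\dot{x}^j + k\Phi_{j,i}\dot{x}^j = 0$$

or:

$$\frac{mc}{A^2}\eta_{ij}\left(A\ddot{x}^j - \dot{A}\dot{x}^j\right) = k(-\Phi_{j,i} + \Phi_{i,j})\dot{x}^j. \tag{11.64}$$

In terms of the four-force, and using $\frac{d\theta}{d\tau} = \frac{c}{A}$, equation (11.64) is written as follows:

$$\frac{mc^2}{A^3}\eta_{ij}\left(A\ddot{x}^j - \dot{A}\dot{x}^j\right) = F_i. \tag{11.65}$$

Let us consider various cases for the parameter θ. The choice θ = τ gives A = c and equations (11.65) read:

$$m\frac{d^2x^i}{d\tau^2} = F^i \tag{11.66}$$

which is the correct equation of motion. For a free particle $F_i = 0$ and equation (11.65) becomes $\frac{d^2x^i}{d\tau^2} = 0$, which is the expected result.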


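In order to compute the equation of motion of the ReMaP in the LCF Σ in which the time is t we set θ = t and the Lagrangian Λ reads:

$$\begin{aligned}
\Lambda_\Sigma &= -mc\sqrt{-\eta_{ij}\frac{dx^i}{dt}\frac{dx^j}{dt}} - k\Phi_i\frac{dx^i}{dt} \\
&= -mc\sqrt{-\eta_{ij}\frac{dx^i}{d\tau}\frac{dx^j}{d\tau}}\;\frac{d\tau}{dt} - k\Phi_i\frac{dx^i}{d\tau}\frac{d\tau}{dt} \\
&= -mc\cdot c\,\frac{1}{\gamma} - \frac{k}{\gamma}\Phi_iu^i \;\Rightarrow
\end{aligned}$$

$$\Lambda_\Sigma = -\frac{mc^2}{\gamma} - \frac{k}{\gamma}\Phi_iu^i. \tag{11.67}$$

We write the Lagrangian $\Lambda_\Sigma$ in terms of the components of the four-velocity in Σ. If we assume in Σ:

$$\Phi^i = (\phi/c,\ \mathbf{w})_\Sigma, \qquad u^i = (\gamma c,\ \gamma\mathbf{v})_\Sigma$$

then:

$$\Phi_iu^i = -\gamma\phi + \gamma\,\mathbf{w}\cdot\mathbf{v} = \gamma(-\phi + \mathbf{w}\cdot\mathbf{v}).$$

Replacing in (11.67) we find:

$$\Lambda_\Sigma = -\frac{mc^2}{\gamma} - k(-\phi + \mathbf{w}\cdot\mathbf{v}). \tag{11.68}$$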


Exercise 11.4.1 
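a. Let $v^\mu = \frac{dx^\mu}{dt}$ be the 3-velocity of a ReMaP P in a LCF Σ. Prove the identity:

$$\frac{\partial}{\partial v^\mu}\frac{1}{\gamma} = -\frac{1}{c^2}\,\gamma\,v_\mu \tag{11.69}$$

[Hint: $\frac{\partial}{\partial v^\mu}v^2 = 2v_\mu$].
b. Show that the 3-momentum $p^\mu$ of P in Σ can be written:

$$p^\mu = \frac{\partial}{\partial v^\mu}\left(-\frac{mc^2}{\gamma}\right) \tag{11.70}$$

where m = constant.
[Hint: $p^\mu = m\gamma v^\mu$].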
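c. Prove the relations:

$$\frac{dp^\mu}{dt} = \frac{d}{dt}\left(\frac{\partial L_{0,\Sigma}}{\partial v^\mu}\right) = \frac{\partial L_{0,\Sigma}}{\partial x^\mu}$$

$$\frac{dp^0}{dt} = \frac{dE}{d(ct)} = \frac{\partial L_{0,\Sigma}}{\partial(ct)}$$

where $L_{0,\Sigma} = -\frac{mc^2}{\gamma}$ and $v^\mu = \frac{dx^\mu}{dt}$. Finally show that:

$$\frac{dp^i}{dt} = \frac{d}{dt}\left(\frac{\partial L_{0,\Sigma}}{\partial\dot{x}^i}\right) = \frac{\partial L_{0,\Sigma}}{\partial x^i}$$

where $\dot{x}^i = \frac{dx^i}{dt}$. Conclude that the Lagrangian of a free particle in Σ is the function $L_{0,\Sigma} = -\frac{mc^2}{\gamma}$.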
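Relation (11.70) of part (b) can be checked numerically before proving it. The following sketch is an illustration with assumed names and units (m = c = 1); it differentiates the free Lagrangian $-mc^2/\gamma$ with respect to the velocity components by central differences and compares the result with $m\gamma v^\mu$.

```python
import math

def free_lagrangian(v, m=1.0, c=1.0):
    """L = -m c^2 / gamma = -m c^2 sqrt(1 - v^2 / c^2)."""
    v2 = sum(x * x for x in v)
    return -m * c**2 * math.sqrt(1.0 - v2 / c**2)

def momentum_numeric(v, m=1.0, c=1.0, h=1e-6):
    """Central-difference estimate of dL/dv^mu for mu = 1, 2, 3."""
    p = []
    for mu in range(3):
        up = list(v)
        dn = list(v)
        up[mu] += h
        dn[mu] -= h
        p.append((free_lagrangian(up, m, c) - free_lagrangian(dn, m, c)) / (2.0 * h))
    return p

def momentum_exact(v, m=1.0, c=1.0):
    """p^mu = m gamma v^mu."""
    gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in v) / c**2)
    return [m * gamma * x for x in v]

if __name__ == "__main__":
    v = (0.3, -0.2, 0.5)
    print(momentum_numeric(v))
    print(momentum_exact(v))
```

The two lists agree to the accuracy of the finite-difference step, which supports the identification of $-mc^2/\gamma$ as the Lagrangian of the free particle in Σ.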


Exercise 11.4.2

a. Let θ be a parameter along the worldline of a ReMaP P such that $\frac{d\theta}{d\tau} > 0$ where τ is the proper time of P. Show that the Lagrangian $L_\theta$ which corresponds to the parameter θ is related to the Lagrangian $L_\tau$, which corresponds to the parameter τ, with the relation:

$$L_\tau = L_\theta\frac{d\theta}{d\tau} \tag{11.71}$$

[Hint: $\delta\int L_\theta\,d\theta = \delta\int L_\tau\,d\tau$].
b. Consider θ = t where t is the time in a LCF Σ and show that:

$$L_{\Sigma^+} = \gamma L_{t,\Sigma} \tag{11.72}$$

where $L_{\Sigma^+}$ is the Lagrangian in the proper frame of P and $L_{t,\Sigma}$ the Lagrangian in Σ.

c. Let Σ, Σ' be two LCF with parallel axes and relative velocity u. Prove that the Lagrangians of an inertial conservative four-vector force field in Σ, Σ' are related as follows:

$$L_\Sigma\,\gamma = L_{\Sigma'}\,\gamma' \tag{11.73}$$

where $\gamma = \frac{1}{\sqrt{1 - \beta^2}}$, $\gamma' = \frac{1}{\sqrt{1 - \beta'^2}}$ and β, β' are the speeds of P in Σ, Σ' respectively.
Making use of the relation $\gamma' = \gamma_u\gamma Q$, where $Q = 1 - \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}$, prove that:

$$L_{\Sigma'}\left(q', \frac{dq'}{dt'}, t'\right) = \frac{1}{\gamma_uQ}\,L_\Sigma\left(q, \frac{dq}{dt}, t\right) \tag{11.74}$$

This relation gives the transformation law of the Lagrangian function under a Lorentz transformation.


11.5 Motion in a Central Potential 


As an application of the Lagrange formalism in Special Relativity we consider
the motion of a ReMaP P of mass m ≠ 0 (not a photon!) in the LCF Σ, under
the action of a central potential given by the function U(r) = U/c. Recall (see 
Example 11.3.1) that the motion of P in another LCF which moves wrt Σ is not in general central. Therefore, whatever results we derive in this section are valid in Σ only. If one wishes to find these results in another LCF, Σ' say, then the results must be transformed by the appropriate Lorentz transformation relating Σ and Σ'.
According to the previous considerations the four-potential defined by the 3-potential U(r) in Σ is the timelike four-vector (U(r), 0, 0, 0). The motion in Σ takes place on a plane, as in Newtonian Mechanics. Indeed, in Σ the 3-force on P is $\mathbf{F} = -c\frac{dU}{dr}\hat{\mathbf{r}}$, therefore the angular momentum L of P wrt the origin of Σ is


constant:

$$\frac{d\mathbf{L}}{dt} = \mathbf{r}\times\mathbf{F} = 0.$$

In addition $\mathbf{L}\cdot\mathbf{r} = 0$, that is, the position vector r is always normal to the constant vector L; consequently the orbit of P in Σ is on a plane.

The Lagrangian of P equals the sum of the Lagrangian $L_0(\tau) = -mc^2$, where τ is the proper time of P, and the potential function U(r). In order to express $L_0$ in terms of the parameter t, the time in Σ, we use the relation:

$$L_0(t)\,dt = L_0(\tau)\,d\tau$$

from which follows ($dt = \gamma\,d\tau$):

$$L_0(t) = -\frac{1}{\gamma}mc^2. \tag{11.75}$$

Consequently the Lagrangian of P in Σ is¹³:

$$L_\Sigma(t) = -\frac{mc^2}{\gamma} - cU(r). \tag{11.76}$$

We consider polar coordinates r, θ so that the speed $v^2 = \dot{r}^2 + r^2\dot\theta^2$ and the final form of the Lagrangian reads:

$$L_\Sigma(r, t) = -mc^2\sqrt{1 - \frac{\dot{r}^2 + r^2\dot\theta^2}{c^2}} - cU(r). \tag{11.77}$$

The angle θ is a cyclic coordinate, therefore the generalized momentum $P_\theta = L_z$ is a constant of motion. We compute:

$$L_z = \frac{\partial L_\Sigma}{\partial\dot\theta} = m\gamma r^2\dot\theta. \tag{11.78}$$

The Lagrange equation for the r-coordinate is:

$$\frac{d}{dt}\frac{\partial L_\Sigma}{\partial\dot{r}} - \frac{\partial L_\Sigma}{\partial r} = 0$$

¹³This Lagrangian follows directly from (11.67) if we change $L_\Sigma \to \Lambda_\Sigma$.


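from which follows:

$$\frac{d}{dt}(\gamma\dot{r}) - \gamma r\dot\theta^2 + \frac{c}{m}\frac{dU}{dr} = 0. \tag{11.79}$$

Because the Lagrangian is independent of time (and there are no non-holonomic constraints) the energy is a constant of motion or, equivalently, a first integral of (11.79). This implies E + cU = C where C is a constant. Solving in terms of γ we find:

$$m\gamma c^2 + cU(r) = C \;\Rightarrow\; \gamma = \frac{C - cU}{mc^2}. \tag{11.80}$$

The speed of the particle can be written:

$$v^2 = \dot{r}^2 + r^2\dot\theta^2 = \left(\frac{dr}{d\theta}\right)^2\dot\theta^2 + r^2\dot\theta^2 = \left[\left(\frac{dr}{d\theta}\right)^2 + r^2\right]\dot\theta^2 \tag{11.81}$$

from which follows, using (11.78) and writing L for $L_z$:

$$\gamma^2\beta^2 = \frac{L^2}{m^2c^2}\left[\left(\frac{1}{r^2}\frac{dr}{d\theta}\right)^2 + \frac{1}{r^2}\right].$$

Replacing $\gamma^2$, $\gamma^2\beta^2$ in the identity $1 + \gamma^2\beta^2 = \gamma^2$, we get the equation of the trajectory of the particle in Σ:

$$1 + \frac{L^2}{m^2c^2}\left[\left(\frac{1}{r^2}\frac{dr}{d\theta}\right)^2 + \frac{1}{r^2}\right] = \left(\frac{C - cU}{mc^2}\right)^2. \tag{11.82}$$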


We introduce the new variable $u = \frac{1}{r}$ and equation (11.82) is written as follows:

$$1 + \frac{L^2}{m^2c^2}\left[\left(\frac{du}{d\theta}\right)^2 + u^2\right] = \left(\frac{C - cU}{mc^2}\right)^2. \tag{11.83}$$

Differentiating this relation wrt θ we find:

$$\frac{2L^2}{m^2c^2}\frac{du}{d\theta}\left[\frac{d^2u}{d\theta^2} + u\right] = -\frac{2}{mc^2}\frac{du}{d\theta}\,\frac{c\,dU}{du}\,\frac{C - cU}{mc^2}.$$

We assume $\frac{du}{d\theta} \neq 0$ (that is, the points at which the function r has extreme values are excluded) and we have the final form of the equation of motion of the particle in Σ for a general central potential U(r) (U(r) is given in Σ!):

$$\frac{d^2u}{d\theta^2} + u = -\frac{1}{L^2}\,\frac{c\,dU}{du}\,\frac{C - cU}{c^2}. \tag{11.84}$$


Of historical interest is the study of the motion of an electron around the nucleus. In this case, the potential function in the proper frame of the nucleus is assumed to have the form $cU(r) = -\frac{Ze^2}{r}$ where Z is the atomic number of the nucleus. For this potential function, the equation of the orbit (11.84) reads:

$$\frac{d^2u}{d\theta^2} + \left(1 - \frac{Z^2e^4}{c^2L^2}\right)u = \frac{Ze^2C}{c^2L^2}. \tag{11.85}$$

If we introduce the new constant $\lambda^2 = 1 - \frac{Z^2e^4}{c^2L^2}$ this equation is written:

$$\frac{d^2u}{d\theta^2} + \lambda^2u = \frac{Ze^2C}{c^2L^2}. \tag{11.86}$$

This equation describes a forced harmonic oscillation with proper frequency λ and driving force $\frac{Ze^2C}{c^2L^2}$. The solution is:

$$u = A\cos\lambda\theta + B\sin\lambda\theta + \frac{1}{\lambda^2}\frac{Ze^2C}{c^2L^2} \tag{11.87}$$


where the constants A, B are determined from the initial conditions. We assume that θ = 0 when the electron is at the perihelion (see Fig. 11.3) and have:

$$\left(\frac{dr}{d\theta}\right)_{\theta=0} = -r^2\left(\frac{du}{d\theta}\right)_{\theta=0} = 0 \;\Rightarrow\; \left(\frac{du}{d\theta}\right)_{\theta=0} = 0.$$

Fig. 11.3 Specifying the coordinates r, θ

Fig. 11.4 Precession of the orbit

Then (11.87) gives B = 0 and the general solution reads:

$$\frac{1}{r} = A\cos\lambda\theta + \frac{Ze^2C}{\lambda^2c^2L^2}. \tag{11.88}$$

We infer that in the proper frame of the nucleus Σ, the orbit of the electron is an ellipse with one focal point at the origin of Σ and eccentricity $\frac{A\lambda^2c^2L^2}{Ze^2C}$. The constant $a = 1/A$ is the distance of the directrix from the focal point (see Fig. 11.3).
The Newtonian limit of the orbit occurs when c → ∞, that is, λ = 1. This orbit is the well known orbit assumed by N. Bohr in his model of the atom. For finite values of c, λ ≠ 1 and the orbit is precessing between two enveloping circles, which are defined by the maximum and the minimum values of r, as shown in Fig. 11.4. The precession is due to the fact that after a change of θ by 2π the factor λθ does not change by 2π unless λ = 1, which is the Newtonian orbit. After a complete rotation the ellipse "closes" by an angle 2π + Δθ where per complete rotation Δθ is defined by the relation:

$$\lambda(2\pi + \Delta\theta) = 2\pi \;\Rightarrow\; \Delta\theta = \frac{2\pi(1 - \lambda)}{\lambda}.$$

Because λ deviates from 1 by terms of the order $\frac{1}{c^2}$, its value is close to 1, hence the deviation Δθ per rotation is very small. Kinematically this means that the orbit precesses along the enveloping circles with very small angular velocity.
Sommerfeld used the precession of the relativistic orbit in the old Quantum 
Theory in order to determine the quantum conditions for the energy levels and 
the probability for the transition between stable orbits of the hydrogen atom. As 
it is well known, in the old Quantum Theory for every periodicity of a stable 
classical orbit, there corresponds a quantum number. For example Bohr had used 
the periodicity of the Newtonian orbit in order to introduce the quantum number 
(=quantize) of angular momentum and, based on that, to compute the energy levels 
of the hydrogen atom. By using the relativistic orbit, Sommerfeld calculated in 
addition the fine energy levels of the energy spectrum of the hydrogen atom. 


If we assume that the angular momentum $L_z$ is quantized in multiples of $\hbar = \frac{h}{2\pi}$ then we find that the minimum value of $L_z = \hbar$ gives the minimum value of λ, which is:

$$\lambda_{\min}^2 = 1 - \frac{Z^2e^4}{c^2\hbar^2} = 1 - Z^2\alpha^2$$

where

$$\alpha = \frac{e^2}{\hbar c} \approx 7.2972\times10^{-3} \approx \frac{1}{137.04}$$

is a universal constant known as the fine structure constant.
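The size of the precession implied by these formulas can be estimated with a few lines of code. The sketch below is illustrative only: the function names are hypothetical and the value of α is the one quoted in the text. It evaluates $\lambda_{\min}$ for a given atomic number and the corresponding angle Δθ per revolution from the relation λ(2π + Δθ) = 2π.

```python
import math

ALPHA = 1.0 / 137.04  # fine structure constant, value quoted in the text

def lambda_min(z, alpha=ALPHA):
    """lambda_min = sqrt(1 - Z^2 alpha^2), the value for L_z = hbar."""
    return math.sqrt(1.0 - (z * alpha) ** 2)

def precession_per_revolution(lam):
    """Delta(theta) = 2 pi (1 - lambda) / lambda, in radians."""
    return 2.0 * math.pi * (1.0 - lam) / lam

if __name__ == "__main__":
    lam = lambda_min(1)
    print(lam, precession_per_revolution(lam))
```

To lowest order in α the angle per revolution is πZ²α², so for hydrogen it is of the order of 10⁻⁴ rad; for λ = 1 the precession vanishes and the orbit closes.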

The constant α determines the strength of the electromagnetic interaction among the fundamental particles by relating the quantum of the electric charge (e) with the quantum of the angular momentum ħ and the speed of light c. The deviation of $\lambda_{\min}$ from 1 is of the order of $Z^2\alpha^2$. For hydrogen (Z = 1) this is of the order of $5.6\times10^{-5}$, which can be observed in spectroscopy. The fine lines in the spectrum of hydrogen were observed and that was one of the most accurate tests of relativistic dynamics.

However, when the dynamics of Special Relativity is applied to the gravitational field, it fails to justify the observational results.

Let us consider the gravitational field of the sun, $cU(r) = -\frac{GMm}{r}$. Then all previous relations hold if one replaces $Ze^2$ with GMm, where m is the mass of the particle which moves under the action of the gravitational field. If we compute from these relations the precession of Mercury we find that it precesses 7″ per century, while the measured value (with accuracy 5″) is 43″ per century. Therefore, when dealing with the gravitational field we should apply the dynamics of General and not of Special Relativity.¹⁴


11.6 Motion of a Rocket 


The motion of a rocket is a motion in which the proper mass changes. In the study of this motion, the rocket is considered to be a ReMaP with proper time τ, four-velocity $u^i$, four-acceleration $a^i$ whose mass is changing at a rate dm/dτ, e.g. by the emission of exhaust gases. The physical variables are the ones which can be measured by the observer inside the rocket (the proper frame). These are:


• The rate of reduction of (proper) mass dm/dτ (an invariant) and
• The relative speed of emitted gases w'.


¹⁴For a different view of this topic see T. E. Philips Jr., "Mercury precession according to Special Relativity" Am. J. Phys. (1986), 54, pp. 245-247.




In the following, we assume one dimensional motion so that the velocity of the exhausted gases is antiparallel to the (constant) direction of the velocity of the rocket. To determine the equation of motion of the rocket we consider the change of the four-momentum between the proper time moments $\tau$ and $\tau + d\tau$. Suppose that at the proper moment $\tau$ the mass of the rocket is $m(\tau)$ and its four-velocity is $u^{i}(\tau)$, and that at the proper moment $\tau + d\tau$ these quantities are (assuming a continuous variation and $dm > 0$) $m(\tau + d\tau) = m(\tau) - dm$ and $u^{i}(\tau + d\tau) = u^{i}(\tau) + du^{i}$.

Let $dm'$ be the mass of the gases (we do not have conservation of mass!) and let $w^{i}$ be their four-velocity. Conservation of four-momentum of the system between the proper time moments $\tau$ and $\tau + d\tau$ gives:

$$m u^{i} = (m - dm)(u^{i} + du^{i}) + dm'\,w^{i} \;\Rightarrow\; m\,du^{i} - dm\,u^{i} + dm'\,w^{i} = 0 \tag{11.89}$$

where we have ignored the term $dm\,du^{i}$.

This equation is covariant, that is, it is valid in all LCF. We write it in the LCF $\Sigma^{+}$ of the IRIO of the rocket and compute it in any other LCF using the appropriate Lorentz transformation. From $u^{i}u_{i} = -c^{2}$ follows $u^{i}du_{i} = 0$, that is, $du^{i}$ is a spacelike four-vector.

Let $\mathbf{e}$ be the unit vector along the direction of motion of the rocket. Then for the four-vectors involved we have the following decomposition:

$$u^{i} = \begin{pmatrix} c \\ \mathbf{0} \end{pmatrix}_{\Sigma^{+}}, \qquad du^{i} = \begin{pmatrix} 0 \\ du'\,\mathbf{e} \end{pmatrix}_{\Sigma^{+}}, \qquad w^{i} = \begin{pmatrix} \gamma(w')c \\ -\gamma(w')w'\,\mathbf{e} \end{pmatrix}_{\Sigma^{+}}$$

where $du'$ is the change in the velocity of the rocket, as measured in $\Sigma^{+}$, between the proper moments $\tau$ and $\tau + d\tau$. Replacing in (11.89) we find the following two equations:

$$dm' = \frac{1}{\gamma(w')}\,dm \tag{11.90}$$

$$m\frac{du'}{dm} = w'. \tag{11.91}$$

Equation (11.90) gives the mass of the exhaust gases in terms of the diminution of the mass of the rocket and equation (11.91) is the equation of motion of the rocket in its proper frame $\Sigma^{+}$.¹⁵

We determine the equation of motion of the rocket in a LCF $\Sigma$ (the Earth say) in which the rocket at the moments $t$, $t + dt$ has velocities $u$ and $u + du$ respectively.


¹⁵Equation (11.90) might give the impression that it is wrong because it relates two invariants, the masses $dm$ and $dm'$, with the scalar factor $\gamma(w')$! However, this view is not correct and it is due to Newtonian ideas. Indeed, the observer inside the rocket does not “know” (that is, cannot measure) the mass of the emitted gases, therefore the quantity $dm'$ refers to the energy of the gases of mass $dm$ and the velocity $w'$ wrt the rocket.




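To do that we write the decomposition of the four-velocity of the rocket in $\Sigma$ and $\Sigma^{+}$:

$$u^{i} + du^{i} = \begin{pmatrix} \gamma(du')c \\ \gamma(du')\,du'\,\mathbf{e} \end{pmatrix}_{\Sigma^{+}} = \begin{pmatrix} \gamma(u+du)c \\ \gamma(u+du)(u+du)\,\mathbf{e} \end{pmatrix}_{\Sigma}$$

and apply the boost with velocity $u$:

$$\gamma(du') = \gamma(u)\left(\gamma(u+du) - \frac{u}{c^{2}}\gamma(u+du)(u+du)\right)$$

$$\gamma(du')\,du' = \gamma(u)\big(\gamma(u+du)(u+du) - u\,\gamma(u+du)\big) = \gamma(u)\gamma(u+du)\,du.$$

Replacing $\gamma(du')$ from the first and solving the second in terms of $du'$ we find, to first order in $du$:

$$du' = \gamma^{2}(u)\,du. \tag{11.92}$$

Replacing $du'$ in (11.91) we find the equation of motion of the rocket in $\Sigma$ to be:

$$m\gamma^{2}(u)\frac{du}{dm} = w'. \tag{11.93}$$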
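The step leading to (11.92) can be checked numerically. The following is a minimal sketch, assuming units with $c = 1$ and using the exact velocity transformation for the increment seen in the proper frame; the function names are illustrative and do not come from the text.

```python
import math

C = 1.0  # units with c = 1 (assumption made for this sketch)


def gamma(v):
    """Lorentz factor of the speed v."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def proper_frame_increment(u, du):
    """Velocity u + du of the rocket as seen from the frame moving with velocity u."""
    return du / (1.0 - u * (u + du) / C**2)


u, du = 0.6, 1.0e-6
du_prime = proper_frame_increment(u, du)

# First boost relation: gamma(du') = gamma(u) gamma(u + du) (1 - u (u + du) / c^2)
lhs = gamma(du_prime)
rhs = gamma(u) * gamma(u + du) * (1.0 - u * (u + du) / C**2)

# Equation (11.92): du' = gamma(u)^2 du, valid to first order in du
ratio = du_prime / (gamma(u) ** 2 * du)
print(lhs - rhs, ratio)
```

The first relation holds exactly, while the ratio differs from 1 only by a term of order $du$, which is the sense in which (11.92) is a first-order statement.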


Equation (11.93) makes it possible for the observer in the rocket to determine his motion in $\Sigma$ by reading (a) the instrument which measures $dm/d\tau$, that is, the rate of consumption of fuel, and (b) a second instrument which measures the relative velocity $w'$ of the exhausted gases wrt the rocket. Indeed, if these two quantities are known then by integrating equation (11.93) the traveler is able to compute the speed and the position of the rocket in $\Sigma$. If we consider an approximation of the order $O(\beta)$ then equation (11.93) reduces to the corresponding equation of Newtonian Physics, as expected.

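In order to solve the equation of motion (11.93) we have to make certain assumptions, which have a sound physical basis. We assume $w' = $ constant, which means that the engines of the rocket work steadily. Furthermore we consider the initial conditions $m(0) = M$, $u(0) = u_{0}$. Recalling that $dm$ in (11.93) denotes the reduction of the mass, equation (11.93) gives:

$$\int_{u_{0}}^{u}\frac{du}{1 - \frac{u^{2}}{c^{2}}} = -w'\int_{M}^{m}\frac{dm}{m}$$

from which follows:

$$\frac{c}{2}\left[\ln\left(\frac{1 + u/c}{1 - u/c}\right) - \ln\left(\frac{1 + u_{0}/c}{1 - u_{0}/c}\right)\right] = \ln\left(\frac{M}{m}\right)^{w'}.$$

Solving for $u$ we find:

$$u = c\,\frac{1 - A\left(\frac{m}{M}\right)^{B}}{1 + A\left(\frac{m}{M}\right)^{B}} \tag{11.94}$$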




Fig. 11.5 Newtonian ($B = 0.1$) and relativistic ($B = 0.4$) motion of the rocket


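where $A = \frac{1 - u_{0}/c}{1 + u_{0}/c}$ and $B = \frac{2w'}{c}$. For initial speeds $u_{0} \ll c$ the term $u_{0}/c$ can be ignored ($A = 1$) and the solution (11.94) is written:

$$u = c\,\frac{1 - \left(\frac{m}{M}\right)^{2w'/c}}{1 + \left(\frac{m}{M}\right)^{2w'/c}}. \tag{11.95}$$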


We note that always $u < c$, and $u \to c$ only when the quotient $m/M \to 0$. Furthermore, the rate of change of the mass $dm/d\tau$ does not enter the calculations. For $u_{0}/c \ll 1$ (and also for $m/M \approx 1$) we recover again the well known Newtonian result. In general, the Newtonian behavior appears when the parameter $(m/M)^{2w'/c} \approx 1$. This happens at two phases of the motion:


• At the beginning of the motion, when $m/M \approx 1$ (and small velocities)
• When $2w'/c \ll 1$, approximately for all the duration of motion.


Practically, we have Newtonian behavior for $w'/c \lesssim 0.1$. In Fig. 11.5 we display the quotient $u/c$ as a function of the ratio $m/M$ for the values $A = 1$ and $B = 0.1$ (Newtonian limit) and $B = 0.4$ (relativistic limit).
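The comparison shown in Fig. 11.5 can be reproduced numerically. The sketch below evaluates (11.94) with $A = (1 - u_{0}/c)/(1 + u_{0}/c)$ and $B = 2w'/c$ and compares it with the Newtonian rocket formula $u = u_{0} + w'\ln(M/m)$. The function names are illustrative, and the Newtonian formula is the standard one rather than a formula quoted in this section.

```python
import math


def rocket_speed(mass_ratio, w_over_c, u0_over_c=0.0):
    """u/c from (11.94), with A = (1 - u0/c)/(1 + u0/c) and B = 2 w'/c."""
    a = (1.0 - u0_over_c) / (1.0 + u0_over_c)
    x = a * mass_ratio ** (2.0 * w_over_c)
    return (1.0 - x) / (1.0 + x)


def newtonian_speed(mass_ratio, w_over_c, u0_over_c=0.0):
    """Newtonian rocket formula u = u0 + w' ln(M/m), in units of c."""
    return u0_over_c + w_over_c * math.log(1.0 / mass_ratio)


for w_over_c in (0.05, 0.2):  # B = 0.1 and B = 0.4
    for mass_ratio in (0.9, 0.5, 0.1):
        rel = rocket_speed(mass_ratio, w_over_c)
        newt = newtonian_speed(mass_ratio, w_over_c)
        print(f"B={2 * w_over_c:.1f} m/M={mass_ratio:.1f} rel={rel:.4f} newt={newt:.4f}")
```

For $u_{0} = 0$ the relativistic expression equals $\tanh[(w'/c)\ln(M/m)]$, so the Newtonian speed plays the role of the rapidity, and the two results agree as long as this quantity is small.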

[Question: What can one say about photonic fuel, that is, fuel for which the exhausted gases are photons? What is the mass $dm'$ in that case?]


Example 11.6.1 A spacecraft departs from the Earth and moves along a constant direction towards the center of the galaxy, which is at a distance of 30,000 ly as measured from the Earth. The engines of the spacecraft work steadily and create a constant proper acceleration $g$ for the first half of the trip and a constant retardation $g$ for the second half of the trip.


1. How long will the trip last according to the proper clock of the spacecraft? 

2. What distance has the spacecraft covered as estimated by the crew in the 
spacecraft? 

3. What fraction of the initial mass of the spacecraft will be used if we assume that the engines of the spacecraft transform the mass of the spacecraft into photons (photonic fuel) with an efficiency of 100% and the radiation is emitted in the direction opposite to the velocity of the spacecraft? Assume $c = 1$.




Solution 


1. Assume that the $x$-axis joins the Earth with the center of the galaxy (this is possible because the 3-d space in Special Relativity is flat, i.e. has zero curvature). Then $y = z = 0$ during all the duration of the trip. For the four-velocity and the four-acceleration of the rocket in the frame $\Sigma$ of the Earth we have the decomposition:

$$u^{i} = (u^{0}, u^{1}, 0, 0)_{\Sigma}, \qquad a^{i} = (a^{0}, a^{1}, 0, 0)_{\Sigma}.$$

The orthogonality relation gives:

$$u^{i}a_{i} = 0 \;\Rightarrow\; -a^{0}u^{0} + a^{1}u^{1} = 0 \;\Rightarrow\; a^{0} = \lambda u^{1}, \quad a^{1} = \lambda u^{0}$$

where $\lambda \neq 0$ is a constant. It is given that $a^{i}a_{i} = g^{2}$. Because the length of a four-vector is invariant we compute it in $\Sigma$ and, using $u^{i}u_{i} = -1$, we find:

$$-(a^{0})^{2} + (a^{1})^{2} = g^{2} \;\Rightarrow\; \lambda = \pm g$$

where $\lambda = +g$ is for the first part of the trip and $\lambda = -g$ for the second (decelerating) part. Finally, for the first part of the trip we have the equations of motion:


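$$\frac{du^{0}}{d\tau} = g\,u^{1}$$

$$\frac{du^{1}}{d\tau} = g\,u^{0}.$$

Differentiating the first wrt $\tau$ and replacing $\frac{du^{1}}{d\tau}$ from the second we find:

$$\frac{d^{2}u^{0}}{d\tau^{2}} = g\frac{du^{1}}{d\tau} = g^{2}u^{0}.$$

The solution of this equation is:

$$u^{0} = A\sinh g\tau + B\cosh g\tau$$

where $A$, $B$ are constants. In a similar manner we compute:

$$u^{1} = A'\sinh g\tau + B'\cosh g\tau.$$

The initial conditions are:

$$\tau = t = 0: \qquad u^{1}(0) = 0, \qquad \frac{du^{1}}{d\tau}(0) = g$$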




from which follows:

$$u^{1}(\tau) = \sinh g\tau, \qquad u^{0}(\tau) = \cosh g\tau. \tag{11.96}$$

Concerning the position we have from (11.96):

$$\frac{dt}{d\tau} = u^{0}(\tau) = \cosh g\tau$$

$$\frac{dx}{d\tau} = \frac{dx}{dt}\frac{dt}{d\tau} = \frac{1}{\gamma}\,u^{1}(\tau)\,\gamma = \sinh g\tau.$$

Assuming the initial conditions $\tau = t = x = 0$ we end up with:

$$t(\tau) = \frac{1}{g}\sinh g\tau, \qquad x(\tau) = \frac{1}{g}(\cosh g\tau - 1)$$

which is (as should be expected!) the expression of hyperbolic motion (see Example 7.4.1).
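The solution can be checked against the defining properties of hyperbolic motion. The following minimal sketch, assuming $c = 1$ and an arbitrary sample value of $g\tau$, verifies that the four-velocity has length $-1$, that the four-acceleration has length $g^{2}$, and that the worldline satisfies $(x + 1/g)^{2} - t^{2} = 1/g^{2}$.

```python
import math


def hyperbolic_motion(g, tau):
    """Return (t, x, u0, u1) for constant proper acceleration g, with c = 1."""
    t = math.sinh(g * tau) / g
    x = (math.cosh(g * tau) - 1.0) / g
    u0 = math.cosh(g * tau)
    u1 = math.sinh(g * tau)
    return t, x, u0, u1


g, tau = 1.0, 2.5  # sample values chosen only for illustration
t, x, u0, u1 = hyperbolic_motion(g, tau)

norm_u = -u0**2 + u1**2                # length of the four-velocity, expected -1
a0, a1 = g * u1, g * u0                # four-acceleration components a0 = g u1, a1 = g u0
norm_a = -a0**2 + a1**2                # length of the four-acceleration, expected g^2
hyperbola = (x + 1.0 / g) ** 2 - t**2  # equation of the worldline, expected 1/g^2
print(norm_u, norm_a, hyperbola)
```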

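In order to get an idea of the result let us make some calculations. To do that we have to express $g$ in units of distance and time. We note that 1 ly = 1 light year $\approx 9.5\times10^{15}$ m and 1 year $\approx 3.15\times10^{7}$ s. Moreover we have assumed $c = 1 \Rightarrow 1\ \text{s} = 3\times10^{8}$ m. Using these, we compute in units of distance:

$$g = 10\ \text{m/s}^{2} = 10\times\frac{1}{(3\times10^{8})^{2}}\ \text{m}^{-1} = 10\times\frac{9.5\times10^{15}}{(3\times10^{8})^{2}}\ \text{ly}^{-1} \approx 1\ \text{ly}^{-1}.$$

Similarly in units of time we have:

$$g = 10\ \text{m/s}^{2} = 10\times\frac{1}{3\times10^{8}}\ \text{s}^{-1} = 10\times\frac{3.15\times10^{7}}{3\times10^{8}}\ \text{year}^{-1} \approx 1\ \text{year}^{-1}.$$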


Using these units we find that for the first part of the trip, that is, $x = 15000$ ly, the time is:

$$\tau = \frac{1}{g}\cosh^{-1}(gx + 1) \;\Rightarrow\; \tau = [(1\ \text{year})^{-1}]^{-1}\cosh^{-1}[(1\ \text{ly})^{-1}(15000\ \text{ly}) + 1] = \cosh^{-1}(15001)\ \text{years} = 10.31\ \text{years}.$$

Obviously for the whole trip twice this time period is required, that is:

$$\tau_{\text{total}} = 2\tau = 20.62\ \text{years}.$$
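The arithmetic above is easy to reproduce. The sketch below uses the rounded conversion factors quoted in the text and then, as the text does, sets $g = 1$ in units of ly$^{-1}$ and year$^{-1}$; the function name is illustrative.

```python
import math

LIGHT_YEAR_M = 9.5e15  # metres, value used in the text
YEAR_S = 3.15e7        # seconds, value used in the text
C_M_PER_S = 3.0e8      # speed of light, m / s
G_SI = 10.0            # proper acceleration, m / s^2

g_per_ly = G_SI / C_M_PER_S**2 * LIGHT_YEAR_M  # g expressed in ly^-1
g_per_year = G_SI / C_M_PER_S * YEAR_S         # g expressed in year^-1


def proper_time_for_distance(x_ly, g=1.0):
    """Proper time in years needed to cover x_ly light years starting from rest."""
    return math.acosh(g * x_ly + 1.0) / g


tau_half = proper_time_for_distance(15000.0)
tau_total = 2.0 * tau_half
print(f"g = {g_per_ly:.3f} per ly = {g_per_year:.3f} per year")
print(f"tau_half = {tau_half:.2f} years, tau_total = {tau_total:.2f} years")
```

Both conversions give a value within about 6% of 1, which justifies the rounding to $g \approx 1$ used in the estimate.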




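The time period between the events of departure and arrival of the spacecraft for the observer on Earth follows by applying $t = \frac{1}{g}\sinh g\tau$ to each half of the trip (the two halves are symmetric, each lasting $\tau = 10.31$ years of proper time):

$$t_{\text{total}} = \frac{2}{g}\sinh g\tau = 2\sinh(10.31)\ \text{years} \approx 3.0\times10^{4}\ \text{years}.$$

Observe the difference between $t$ and $\tau$ and appreciate the well known twin paradox.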

2. In order to compute the distance covered by the spacecraft from the Earth, as estimated by the crew of the spacecraft, we consider coordinates $(cT, X)$ in the proper frame of the spacecraft and apply the boost with velocity $u^{1}/\gamma$, where $u^{1}$ is the $x$-component of the four-velocity. We find:

$$X = \gamma\left(x - \frac{u^{1}}{\gamma}\,t\right) \tag{11.97}$$

$$T = \gamma\left(t - \frac{u^{1}}{\gamma}\,x\right). \tag{11.98}$$

But:

$$\gamma = u^{0} = \cosh g\tau$$

therefore (11.97) and (11.98) give:

$$X = \frac{1}{g}(1 - \cosh g\tau) = -x.$$

This result is important because it proves that the distance of the spacecraft from the Earth is the same as estimated either by the observer at the Earth or by the crew in the spacecraft.

3. Let $M_{0}$ be the initial mass of the spacecraft and $M(\tau)$ the mass after proper time $\tau$. Then the energy is $E(\tau) = M(\tau)\gamma(\tau) = M(\tau)u^{0}$ and the 3-momentum is $P(\tau) = M(\tau)\gamma(\tau)u = M(\tau)u^{1}$. At the proper moment $\tau + d\tau$ the changes of the energy and the momentum of the spacecraft, $dE$ and $dP$ respectively, equal in magnitude the corresponding quantities $dE_{\text{photons}}$, $dP_{\text{photons}}$ of the emitted photons. But for photons we have that $dE_{\text{photons}} = dP_{\text{photons}}$ ($c = 1$!) hence we have the following equation of motion:

$$-dP = dE \;\Rightarrow\; d(E + P) = 0 \;\Rightarrow\; E + P = \text{constant}.$$

From the initial conditions $E(0) = M_{0}$, $P(0) = 0$ we compute that the value of the constant is $M_{0}$, therefore we have finally:

$$M(\tau)u^{0} + M(\tau)u^{1} = M_{0} \;\Rightarrow\; M(\tau) = \frac{M_{0}}{u^{0} + u^{1}}.$$




We replace $u^{0}$, $u^{1}$ from (11.96) and get:

$$M(\tau) = M_{0}e^{-g\tau}.$$

For proper time $\tau_{\text{total}} = 20.62$ years we find that the percentage of the mass which has been used equals:

$$\Delta M = \left(1 - \frac{M}{M_{0}}\right)\times100 = (1 - e^{-20.62})\times100 \approx 99.999\,\%.$$
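A short numerical sketch makes the size of this fraction concrete. It evaluates $M/M_{0} = 1/(u^{0} + u^{1})$ with the hyperbolic four-velocity (11.96), assuming $c = 1$ and the rounded value $g = 1\ \text{year}^{-1}$ used in the text; the function name is illustrative.

```python
import math


def remaining_mass_fraction(g, tau):
    """M / M0 = 1 / (u0 + u1) with u0 = cosh(g tau), u1 = sinh(g tau), c = 1."""
    u0 = math.cosh(g * tau)
    u1 = math.sinh(g * tau)
    return 1.0 / (u0 + u1)


g = 1.0  # year^-1, the rounded value used in the text
for tau in (1.0, 10.31, 20.62):
    fraction = remaining_mass_fraction(g, tau)
    used = 100.0 * (1.0 - fraction)
    print(f"tau = {tau:5.2f} years: M/M0 = {fraction:.3e}, used = {used:.7f} %")
```

Since $\cosh g\tau + \sinh g\tau = e^{g\tau}$, the function reproduces $M = M_{0}e^{-g\tau}$, and after 20.62 years of proper time only about one part in $10^{9}$ of the initial mass remains.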
0 


Example 11.6.2 A particle of mass $m$ and velocity $u$ moves along the $x$-axis of the LCF $\Sigma$.


1. Show that the 3-momentum and the energy of the particle are given by the relations $P = mc\sinh\chi$, $E = mc^{2}\cosh\chi$ where $\chi$ is the rapidity of the particle in $\Sigma$. Deduce that the speed and the rapidity of the particle in $\Sigma$ are related as follows: $u(t) = c\tanh\chi$.

2. Consider a second LCF $\Sigma'$ which moves wrt $\Sigma$ in the standard configuration along the $x$-axis with speed $v$. Show that in $\Sigma'$ the 3-momentum $P'$ and the energy $E'$ of the particle are given by the following formulae:

$$P' = mc\sinh(\chi + \psi), \qquad E' = mc^{2}\cosh(\chi + \psi)$$

where $\tanh\psi = v/c$. Explain the geometric meaning of this result. Assume that the speed $u$ changes to $u + du$. Prove that:

$$\gamma^{2}(u)\,du = c\,d\chi.$$

If $du'$ is the corresponding change of the velocity in $\Sigma'$ show that:

$$\gamma^{2}(u')\,du' = c\,d\chi.$$

3. Application.

A rocket moves freely by exhausting gases with a constant rate $dM/d\tau$ and velocity $-w'\hat{\mathbf{e}}$, where $\hat{\mathbf{e}}$ is the (constant) direction of motion of the rocket. If the rocket starts from a space platform $A$ with initial mass $M_{0}$ and moves along the $x$-axis, calculate the velocity $u(t)$ of the rocket wrt the platform when its mass is $M(t)$.


Solution 


2. The speed $u$ and the rapidity $\chi$ of a particle in a LCF $\Sigma$ are related as follows: $u = c\tanh\chi$. Differentiating we find:

$$du = c\,\frac{1}{\cosh^{2}\chi}\,d\chi = \frac{c}{\gamma^{2}(u)}\,d\chi \;\Rightarrow\; \gamma^{2}(u)\,du = c\,d\chi. \tag{11.99}$$




From the relativistic rule of composing velocities we have that the rapidity $\chi'$ of the particle in $\Sigma'$ is:

$$\chi' = \chi + \psi$$

where $\psi$ is the rapidity of $\Sigma'$ wrt $\Sigma$ ($v = c\tanh\psi$). If $u'$ is the velocity of the particle in $\Sigma'$, then applying relation (11.99) for the velocity $u'$ we find:

$$\gamma^{2}(u')\,du' = c\,d\chi' = c\,d\chi = \gamma^{2}(u)\,du \tag{11.100}$$

because $d\psi = 0$, which means that the quantity $\gamma^{2}(u)\,du$ is an invariant (but depends on the time $t$ in $\Sigma$).
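Both the additivity of the rapidity and the invariance (11.100) can be illustrated numerically. The sketch below assumes $c = 1$ and composes the velocities with the standard collinear composition rule, which corresponds to the convention $\chi' = \chi + \psi$ used above; the function names and sample values are illustrative.

```python
import math

C = 1.0  # units with c = 1 (assumption made for this sketch)


def gamma(v):
    """Lorentz factor of the speed v."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def compose(u, v):
    """Relativistic composition of two collinear velocities."""
    return (u + v) / (1.0 + u * v / C**2)


def rapidity(v):
    """Rapidity chi defined by v = c tanh(chi)."""
    return math.atanh(v / C)


u, du, v = 0.3, 1.0e-7, 0.5
u_prime = compose(u, v)
du_prime = compose(u + du, v) - u_prime

rapidity_defect = rapidity(u_prime) - (rapidity(u) + rapidity(v))
invariant_sigma = gamma(u) ** 2 * du
invariant_sigma_prime = gamma(u_prime) ** 2 * du_prime
print(rapidity_defect)
print(invariant_sigma, invariant_sigma_prime)
```

The two values of $\gamma^{2}\,du$ agree up to terms of second order in $du$, even though $du$ and $du'$ themselves differ by almost a factor of two in this example.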

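Application


The equation of motion of the rocket in $\Sigma$ is (see equation (11.93)):

$$m\gamma^{2}(u)\frac{du}{dm} = w'$$

where $w'$ is the speed of the exhausted gases wrt the rocket and $dm$ is the reduction of the mass. In order to write this equation in terms of the rapidity $\chi$ we use (11.99) and find:

$$c\,d\chi = w'\frac{dm}{m}.$$

Since the mass of the rocket changes by $dM = -dm$, the solution of this equation is:

$$\chi - \chi_{0} = \ln\left(\frac{M_{0}}{M}\right)^{w'/c}.$$

From the initial conditions we have $\chi_{0} = \tanh^{-1}0 = 0$, hence:

$$\chi = \ln\left(\frac{M_{0}}{M}\right)^{w'/c}. \tag{11.101}$$

From the relation which gives the rapidity we find $e^{\chi} = \sqrt{\frac{1+\beta}{1-\beta}}$, where $\beta = u/c$. Replacing in (11.101):

$$\beta = \frac{1 - (M/M_{0})^{2w'/c}}{1 + (M/M_{0})^{2w'/c}}.$$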


11.7 The Frenet-Serret Frame in Minkowski Space 


In classical vector calculus the Frenet-Serret equations define at each point along 
a smooth curve an orthonormal basis — or a “moving frame” — which is called 
the Frenet-Serret frame. The usefulness of the Frenet-Serret frame is that it 
characterizes and is characterized uniquely by the curve. In this section, we define 
the Frenet-Serret frame along a curve in Minkowski space and demonstrate its 
application in the definition of a generic four-force vector. It is apparent that in 
Minkowski space the equations defining the Frenet-Serret frame along a given curve 
must be four and the orthonormal tetrad defining the frame will involve Lorentz 
orthogonal vectors. 

The first difference between the Euclidean and the Lorentzian Frenet-Serret frame is that in the first case one has one type of curves whereas in the second one has three types of curves, the timelike, the spacelike and the null.¹⁶ Furthermore, the null curves are degenerate in the sense that their tangent vector is also normal to the curve. Therefore, the Frenet-Serret frame applies only to non-null (i.e. timelike and spacelike) curves.

Let $c^{i}(r)$ be a smooth non-null curve in Minkowski space and let $r$ be an affine parameter¹⁷ so that the tangent vector to the curve $A^{i} = dc^{i}/dr$ satisfies:

$$A^{i}A_{i} = \epsilon(A) \tag{11.102}$$

$\epsilon(A)$ being the sign of $A^{i}$, that is, $\epsilon(A) = +1/-1$ if $A^{i}$ is spacelike/timelike respectively.

In the following, we shall denote the covariant differentiation along $A^{i}$ with a dot, e.g.:

$$\dot{T}^{i\ldots j}{}_{r\ldots s} = T^{i\ldots j}{}_{r\ldots s;k}\,A^{k}. \tag{11.103}$$

Because $A^{i}$ is unit, $\dot{A}^{i}A_{i} = 0$.
Let $B^{i}$ be the unit vector along $\dot{A}^{i}$ so that:

$$\dot{A}^{i} = bB^{i} \qquad (b > 0). \tag{11.104}$$

Then:

$$B^{i}A_{i} = 0, \qquad B^{i}B_{i} = \epsilon(B). \tag{11.105}$$


¹⁶Besides these curves there are infinitely many other curves in Minkowski space which, however, do not interest us in relativity.

¹⁷Recall that a parameter along a non-null parameterized curve is called affine if the tangent vector along the curve (for this parametrization) has unit length, i.e. $\pm1$. If $r$ is an affine parameter then $ar + b$, where $a, b \in \mathbb{R}$, is also an affine parameter.




We consider the derivative $\dot{B}^{i}$ and decompose it parallel and normal to the plane defined by $A^{i}$, $B^{i}$. We write:

$$\dot{B}^{i} = aA^{i} + cC^{i} \tag{11.106}$$

where $a$, $c$ are coefficients and $C^{i}$ is a unit vector normal to both $A^{i}$ and $B^{i}$ and such that the quantity $\eta_{ijk}A^{i}B^{j}C^{k} > 0$ (this defines the positive orientation of the frame). Then we have:

$$C^{i}C_{i} = \epsilon(C) \tag{11.107}$$

and

$$A^{i}C_{i} = B^{i}C_{i} = 0. \tag{11.108}$$

In order to compute the coefficients $a$, $c$ we differentiate (11.105) and use (11.104) to get:

$$\dot{B}^{i}A_{i} = -B^{i}\dot{A}_{i} = -b\,\epsilon(B).$$

But from (11.106) we have:

$$\dot{B}^{i}A_{i} = a\,\epsilon(A) \;\Rightarrow\; a = -\epsilon(A)\epsilon(B)\,b. \tag{11.109}$$

Replacing in (11.106) we obtain finally:

$$\dot{B}^{i} = -\epsilon(A)\epsilon(B)\,bA^{i} + cC^{i}. \tag{11.110}$$

We apply the same procedure to the unit vector $C^{i}$ and write:

$$\dot{C}^{i} = \beta A^{i} + \gamma B^{i} + dD^{i} \tag{11.111}$$

where $D^{i}$ is the unit normal to the trihedral defined by the four-vectors $A^{i}$, $B^{i}$, $C^{i}$, i.e.:

$$D^{i}D_{i} = \epsilon(D) \tag{11.112}$$

and

$$A^{i}D_{i} = B^{i}D_{i} = C^{i}D_{i} = 0. \tag{11.113}$$

In order to compute the coefficients $\beta$, $\gamma$ we contract (11.111) with $A^{i}$, $B^{i}$ and use (11.108) to find:

$$\beta\,\epsilon(A) = \dot{C}^{i}A_{i} = -C^{i}\dot{A}_{i} = -C^{i}\,bB_{i} = 0$$




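and

$$\gamma\,\epsilon(B) = \dot{C}^{i}B_{i} = -C^{i}\dot{B}_{i} = -\epsilon(C)\,c.$$

Hence:

$$\beta = 0, \qquad \gamma = -\epsilon(B)\epsilon(C)\,c$$

and (11.111) is written as follows:

$$\dot{C}^{i} = -\epsilon(B)\epsilon(C)\,cB^{i} + dD^{i}. \tag{11.114}$$

We decompose similarly $\dot{D}^{i} = \delta A^{i} + \varepsilon B^{i} + \eta C^{i}$ and making use of (11.113) we find $\delta = \varepsilon = 0$ and $\eta = -d\,\epsilon(C)\epsilon(D)$, so that¹⁸:

$$\dot{D}^{i} = -\epsilon(C)\epsilon(D)\,dC^{i}. \tag{11.115}$$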


From the above analysis we reach the following conclusion:


Proposition 11.7.1 At every point along a timelike smooth curve in Minkowski space, affinely parameterized with the parameter $r$ = arc length, it is possible to construct an orthonormal tetrad $A^{i}$, $B^{i}$, $C^{i}$, $D^{i}$ such that the four-vector $A^{i}$ is tangent to the curve and the remaining three, mutually perpendicular, four-vectors $B^{i}$, $C^{i}$, $D^{i}$ are determined from the solution of the following system of differential equations:

$$\dot{A}^{i} = bB^{i} \tag{11.116}$$

$$\dot{B}^{i} = -\epsilon(A)\epsilon(B)\,bA^{i} + cC^{i} \tag{11.117}$$

$$\dot{C}^{i} = -\epsilon(B)\epsilon(C)\,cB^{i} + dD^{i} \tag{11.118}$$

$$\dot{D}^{i} = -\epsilon(C)\epsilon(D)\,dC^{i} \tag{11.119}$$

where a dot over a symbol indicates differentiation wrt $r$.


The four-vectors $B^{i}$, $C^{i}$, $D^{i}$ we call the first, second and third normal respectively of the curve $c^{i}$ and the parameters $b$, $c$, $d$ the first, second and third curvature


¹⁸The four-vector $D^{i}$ is determined uniquely from the “outer product” of the three four-vectors $A^{i}$, $B^{i}$, $C^{i}$, i.e. $D^{i} = \eta^{i}{}_{jkl}A^{j}B^{k}C^{l}$, where $\eta_{ijkl} = \sqrt{-g}\,\epsilon_{ijkl}$ and $\epsilon_{ijkl}$ is the Levi-Civita symbol ($\epsilon_{ijkl} = +1$ if $ijkl$ is an even permutation of 0123 and $\epsilon_{ijkl} = -1$ if it is an odd permutation). This implies that the parameter $d$ can be expressed in terms of the parameters $a$, $b$, $c$ and their derivatives. Therefore, in Minkowski space we have three four-vectors and three parameters associated uniquely with a given smooth curve and the associated Frenet-Serret frame. The orientation of this frame is positive or negative according to the sign of the determinant of the $4\times4$ matrix defined by the four-vectors $A^{i}$, $B^{i}$, $C^{i}$, $D^{i}$, or equivalently from the sign of the quantity $\eta_{ijkl}A^{i}B^{j}C^{k}D^{l}$.




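of $c^{i}$. The geometric significance of the Frenet-Serret curvatures is that they define uniquely the curve $c^{i}$ (for given initial conditions).¹⁹ If we denote the curvatures $b$, $c$, $d$ of the curve by $\kappa_{1}$, $\kappa_{2}$, $\kappa_{3}$ respectively the Frenet-Serret relations are written in the following matrix notation:

$$\begin{pmatrix} \dot{A}^{i} \\ \dot{B}^{i} \\ \dot{C}^{i} \\ \dot{D}^{i} \end{pmatrix} = S \begin{pmatrix} A^{i} \\ B^{i} \\ C^{i} \\ D^{i} \end{pmatrix} \tag{11.120}$$

where the $4\times4$ matrix:

$$S = \begin{pmatrix} 0 & \kappa_{1} & 0 & 0 \\ -\epsilon(A)\epsilon(B)\kappa_{1} & 0 & \kappa_{2} & 0 \\ 0 & -\epsilon(B)\epsilon(C)\kappa_{2} & 0 & \kappa_{3} \\ 0 & 0 & -\epsilon(C)\epsilon(D)\kappa_{3} & 0 \end{pmatrix}. \tag{11.121}$$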


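In case the curve $c^{i}$ is timelike, $\epsilon(A) = -1$, $\epsilon(B) = \epsilon(C) = \epsilon(D) = +1$ and equations (11.116), (11.117), (11.118) and (11.119) become:

$$\dot{A}^{i} = \kappa_{1}B^{i} \tag{11.122}$$

$$\dot{B}^{i} = \kappa_{1}A^{i} + \kappa_{2}C^{i} \tag{11.123}$$

$$\dot{C}^{i} = -\kappa_{2}B^{i} + \kappa_{3}D^{i} \tag{11.124}$$

$$\dot{D}^{i} = -\kappa_{3}C^{i}. \tag{11.125}$$

Consequently the matrix $S$ is:

$$S = \begin{pmatrix} 0 & \kappa_{1} & 0 & 0 \\ \kappa_{1} & 0 & \kappa_{2} & 0 \\ 0 & -\kappa_{2} & 0 & \kappa_{3} \\ 0 & 0 & -\kappa_{3} & 0 \end{pmatrix}. \tag{11.126}$$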
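The sign factors in $S$ are exactly what is needed for the tetrad to stay orthonormal along the curve. If $G = \mathrm{diag}(\epsilon(A), \epsilon(B), \epsilon(C), \epsilon(D))$ is the matrix of inner products of the tetrad, differentiating these inner products gives the condition $SG + (SG)^{T} = 0$. The following minimal sketch checks this condition for sample curvatures; the function names and numerical values are illustrative and not part of the text.

```python
def frenet_serret_matrix(eps, kappa):
    """Matrix S of (11.121); eps = (eA, eB, eC, eD), kappa = (k1, k2, k3)."""
    e_a, e_b, e_c, e_d = eps
    k1, k2, k3 = kappa
    return [
        [0.0, k1, 0.0, 0.0],
        [-e_a * e_b * k1, 0.0, k2, 0.0],
        [0.0, -e_b * e_c * k2, 0.0, k3],
        [0.0, 0.0, -e_c * e_d * k3, 0.0],
    ]


def preserves_orthonormality(eps, kappa):
    """True if S G + (S G)^T = 0, where G = diag(eps) holds the tetrad inner products."""
    s = frenet_serret_matrix(eps, kappa)
    sg = [[s[i][j] * eps[j] for j in range(4)] for i in range(4)]
    return all(abs(sg[i][j] + sg[j][i]) < 1e-12 for i in range(4) for j in range(4))


timelike = (-1, 1, 1, 1)   # signs used for a timelike curve
spacelike = (1, -1, 1, 1)  # one possible choice of signs for a spacelike curve
print(preserves_orthonormality(timelike, (0.7, 0.3, 0.2)))
print(preserves_orthonormality(spacelike, (0.7, 0.3, 0.2)))
```

For the timelike signs the function reproduces the matrix (11.126), whose first $2\times2$ block is symmetric rather than antisymmetric, in contrast with the Euclidean case.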


In case $c^{i}$ is a spacelike curve then $\epsilon(A) = 1$, $\epsilon(B) = -1$ (say), $\epsilon(C) = \epsilon(D) = 1$ and relations (11.116), (11.117), (11.118) and (11.119) are:

$$\dot{A}^{i} = \kappa_{1}B^{i} \tag{11.127}$$


¹⁹We note that the same relations hold in the case of General Relativity with the difference that the partial derivative is replaced with the covariant Riemannian derivative. Moreover if we set $\epsilon(A) = 1$, $D^{i} = 0$ we recover the Frenet-Serret formulae of the Euclidean 3-d geometry.




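$$\dot{B}^{i} = \kappa_{1}A^{i} + \kappa_{2}C^{i} \tag{11.128}$$

$$\dot{C}^{i} = \kappa_{2}B^{i} + \kappa_{3}D^{i} \tag{11.129}$$

$$\dot{D}^{i} = -\kappa_{3}C^{i}. \tag{11.130}$$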


Exercise 11.7.1 Prove that the Frenet-Serret coefficients can be written in covariant form as follows:

$$\kappa_{1} = \dot{A}^{i}B_{i}, \qquad \kappa_{2} = \dot{B}^{i}C_{i}, \qquad \kappa_{3} = \dot{C}^{i}D_{i}. \tag{11.131}$$

Deduce that the curvatures $b$, $c$, $d$ are invariants (being inner products of four-vectors) and therefore characterize a curve in an intrinsic manner (that is, independent of the choice of coordinate system). As we have already remarked, it is this result which makes the Frenet-Serret formalism a useful geometric tool in the study of worldlines.


Exercise 11.7.2 The straight line in Minkowski space is defined by the condition $A^{i} = $ constant. Use the Frenet-Serret equations (11.116), (11.117), (11.118) and (11.119) to show that for a straight line the curvatures $\kappa_{1} = \kappa_{2} = \kappa_{3} = 0$. Conversely, using the values $\kappa_{1} = \kappa_{2} = \kappa_{3} = 0$ in the same equations prove that the curve they describe is a straight line. This result makes possible the definition of the straight line in an algebraic and covariant manner.


In order to show how one computes the Frenet-Serret basis along a worldline (i.e. a timelike curve) we consider a ReMaP $P$, which starts to move from the origin of an LCF $\Sigma$ along the $x$-axis with constant proper acceleration $a > 0$ (hyperbolic motion). As we have shown [see equation (7.48)] the worldline of $P$ is ($c = 1$):

$$x = \frac{1}{a}(\cosh(a\tau) - 1), \qquad y = 0, \qquad z = 0$$

where $\tau$ is the proper time of $P$ (related to the time $t$ of $\Sigma$ by the relation $t = \frac{1}{a}\sinh(a\tau)$). The four-vector $A^{i}$ is the four-velocity $u^{i}$ with components $u^{i} = (u^{0}, u^{1}, 0, 0)_{\Sigma}$ where $u^{0} = \cosh(a\tau)$, $u^{1} = \sinh(a\tau)$. The four-vector $\dot{A}^{i}$ is the four-acceleration of $P$, which in $\Sigma$ is given by $a^{i} = (a^{0}, a^{1}, 0, 0)_{\Sigma}$ where $a^{0} = a\sinh(a\tau) = au^{1}$, $a^{1} = a\cosh(a\tau) = au^{0}$. The length of the four-acceleration is $a$ (so that $\kappa_{1} = a$), hence the unit vector along the direction of $a^{i}$, which is also the first normal to the curve, is $B^{i} = (u^{1}, u^{0}, 0, 0)_{\Sigma}$. In order to compute the second normal to the curve we differentiate $B^{i}$ and find:

$$\dot{B}^{i} = (a^{1}, a^{0}, 0, 0)_{\Sigma} = (au^{0}, au^{1}, 0, 0)_{\Sigma} = au^{i}.$$

From (11.123) follows $\kappa_{2} = 0$. For this value of $\kappa_{2}$ equations (11.124) and (11.125) are independent of the remaining two and are satisfied by infinitely many pairs of four-vectors $C^{i}$, $D^{i}$. This is due to the fact that the motion takes place in the plane $x - ct$, which can be embedded in infinitely many ways into the four-dimensional Minkowski space.
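The computation for hyperbolic motion can be repeated numerically. The following minimal sketch, assuming $c = 1$ and signature $(-,+,+,+)$, differentiates the tangent and the first normal by finite differences and recovers $\kappa_{1} = a$ and $\kappa_{2}C^{i} = \dot{B}^{i} - \kappa_{1}A^{i} = 0$. The helper names and the sample values are illustrative.

```python
import math


def minkowski_dot(p, q):
    """Inner product with signature (-, +, +, +) and c = 1."""
    return -p[0] * q[0] + sum(pi * qi for pi, qi in zip(p[1:], q[1:]))


def frame(a, tau):
    """Tangent A (four-velocity) and first normal B for hyperbolic motion."""
    ch, sh = math.cosh(a * tau), math.sinh(a * tau)
    return (ch, sh, 0.0, 0.0), (sh, ch, 0.0, 0.0)


def derivative(vector_of_tau, tau, h=1.0e-5):
    """Central finite difference of a four-vector valued function of tau."""
    plus, minus = vector_of_tau(tau + h), vector_of_tau(tau - h)
    return tuple((p - m) / (2.0 * h) for p, m in zip(plus, minus))


a, tau = 0.8, 1.2  # sample values chosen only for illustration
A, B = frame(a, tau)
A_dot = derivative(lambda s: frame(a, s)[0], tau)
B_dot = derivative(lambda s: frame(a, s)[1], tau)

kappa1 = minkowski_dot(A_dot, B)  # first curvature, expected to equal a
residual = tuple(bd - kappa1 * ai for bd, ai in zip(B_dot, A))  # kappa2 * C, expected 0
print(kappa1, residual)
```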


11.7.1 The Physical Basis 


The Frenet-Serret frame has a special physical significance in the case of timelike curves. Indeed in this case the vector $A^{i}$ is the four-velocity of the ReMaP $P$, the first normal is the direction of the four-acceleration while the first curvature $\kappa_{1}$ is the measure of the four-acceleration. Therefore, the Frenet-Serret frame covers the basic physical quantities of Kinematics. There still remains a part which contains higher than the first derivatives of the four-velocity. This extra part must be related to the dynamics of motion, that is, the external four-force which modulates the motion of $P$.

It is speculated that the four-force depends on the higher derivatives of the four-velocity when a charge accelerates. Indeed, it is assumed that an accelerating charge radiates an electromagnetic field which exerts a force back on the charge. This force is not a Lorentz force. Up to now, many formulae for this type of force have been proposed but it appears that “the force” has not yet been found. Although we do not know the actual expression of this type of force, it is nonetheless possible, by the use of the Frenet-Serret frame, to give the generic expression for this four-force in terms of the “physical” kinematic quantities $u^{i}$, $\dot{u}^{i}$, $\ddot{u}^{i}$, $\dddot{u}^{i}$. That is, we give a parameterized geometric expression which contains all possible four-forces one could conceive. The role of Physics is to select the appropriate values of the parameters!

In the following, we assume that the generic four-force is a pure four-force, that 
is a spacelike four-vector normal to the four-velocity. Summarizing we have to solve 
the following problem: 


Compute the generic form of a spacelike four-vector along a smooth non-null curve — which 
is not a straight line — in a basis which consists of the unit tangent to the curve and its higher 
derivatives, assuming that they do not vanish. We call this new (not in general orthonormal) 
basis the physical basis along the curve. 


This means that we are looking (if it exists!) for an expression of the form:

$$r^{i} = f_{0}(u^{i}, \dot{u}^{i}, \ldots)\,u^{i} + f_{1}(u^{i}, \dot{u}^{i}, \ldots)\,\dot{u}^{i} + \cdots \tag{11.132}$$

where $r^{i}$ is an arbitrary four-vector (null or non-null) defined along the curve.
Let $c^{i}$ be a non-null, smooth, affinely parameterized curve in Minkowski space with unit tangent vector $A^{i}$. From the Frenet-Serret relations we have:

$$A^{i}, \qquad \dot{A}^{i} = bB^{i} \tag{11.133}$$

where:

$$A^{i}A_{i} = \epsilon(A), \qquad B^{i}A_{i} = 0, \qquad B^{i}B_{i} = \epsilon(B).$$


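Differentiating (11.133) twice along the curve we find²⁰:

$$\ddot{A}^{i} = -\epsilon(A)\epsilon(B)\,b^{2}A^{i} + \dot{b}\,B^{i} + bc\,C^{i} \tag{11.134}$$

$$\dddot{A}^{i} = \left[-3\epsilon(A)\epsilon(B)\,b\dot{b}\right]A^{i} + \left[\ddot{b} - \epsilon(A)\epsilon(B)\,b^{3} - \epsilon(B)\epsilon(C)\,bc^{2}\right]B^{i} + \left[2c\dot{b} + b\dot{c}\right]C^{i} + bcd\,D^{i}. \tag{11.135}$$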


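We conclude that in general the tangent and the Frenet-Serret bases are related as follows:

$$\begin{pmatrix} A^{i} \\ \dot{A}^{i} \\ \ddot{A}^{i} \\ \dddot{A}^{i} \end{pmatrix} = R \begin{pmatrix} A^{i} \\ B^{i} \\ C^{i} \\ D^{i} \end{pmatrix} \tag{11.136}$$

where the $4\times4$ matrix:

$$R = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & b & 0 & 0 \\ -\epsilon(A)\epsilon(B)\,b^{2} & \dot{b} & bc & 0 \\ -3\epsilon(A)\epsilon(B)\,b\dot{b} & \ddot{b} - \epsilon(A)\epsilon(B)\,b^{3} - \epsilon(B)\epsilon(C)\,bc^{2} & b\dot{c} + 2c\dot{b} & bcd \end{pmatrix}.$$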


The matrix $R$ defines a change of bases (not necessarily a change of coordinates²¹) if its determinant does not vanish. We compute:

$$\det(R) = b^{3}c^{2}d \tag{11.137}$$

from which we infer that $R$ defines a change of basis provided $bcd \neq 0$. Kinematically this means that the physical basis is possible only for non-planar accelerated motions (see Exercise 11.7.2). In case one of the curvatures $b$, $c$, $d$ vanishes we can consider higher derivatives of $A^{i}$ until we obtain a basis along the curve.

In the following we assume $bcd \neq 0$ and express the four-vectors of the Frenet-Serret tetrad in terms of the four-vectors of the physical basis using the matrix


²⁰The three four-vectors $A^{i}$, $\dot{A}^{i}$, $\ddot{A}^{i}$ are sufficient (provided they are independent) in order to compute the fourth by the formula $T^{i} = \eta^{i}{}_{jkl}A^{j}\dot{A}^{k}\ddot{A}^{l}$. This means that, in general, the physical basis requires only up to the second derivative of the vector $A^{i}$ along the curve.

²¹A basis in a linear space does not necessarily follow from a system of coordinate functions, that is, the basis vectors cannot always be written as tangent vectors to a system of coordinate lines. A criterion that a set of vectors is generated by a coordinate system is that the vectors commute. If this is the case, we call the basis holonomic, otherwise unholonomic.




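$R$. We have:

$$\begin{pmatrix} A^{i} \\ B^{i} \\ C^{i} \\ D^{i} \end{pmatrix} = R^{-1} \begin{pmatrix} A^{i} \\ \dot{A}^{i} \\ \ddot{A}^{i} \\ \dddot{A}^{i} \end{pmatrix}. \tag{11.138}$$

We compute:

$$R^{-1} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \frac{1}{b} & 0 & 0 \\ \frac{\epsilon(A)\epsilon(B)\,b}{c} & -\frac{\dot{b}}{b^{2}c} & \frac{1}{bc} & 0 \\ \Lambda & M & N & \Xi \end{pmatrix} \tag{11.139}$$

where:

$$\Lambda = (R^{-1})_{41} = -\frac{\epsilon(A)\epsilon(B)(\dot{c}b - \dot{b}c)}{c^{2}d} \tag{11.140}$$

$$M = (R^{-1})_{42} = \frac{\epsilon(A)\epsilon(B)\,cb^{4} + \epsilon(B)\epsilon(C)\,c^{3}b^{2} - b(\ddot{b}c - \dot{b}\dot{c}) + 2c\dot{b}^{2}}{c^{2}b^{3}d} \tag{11.141}$$

$$N = (R^{-1})_{43} = \frac{-\dot{c}b - 2\dot{b}c}{b^{2}c^{2}d} \tag{11.142}$$

$$\Xi = (R^{-1})_{44} = \frac{1}{bcd}. \tag{11.143}$$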


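Exercise 11.7.3 Verify that the matrix $R^{-1}$ is the inverse of the matrix $R$. Then show that:

$$A^{i} = u^{i}$$

$$B^{i} = \frac{1}{b}\,\dot{u}^{i}$$

$$C^{i} = \frac{\epsilon(A)\epsilon(B)\,b}{c}\,u^{i} - \frac{\dot{b}}{b^{2}c}\,\dot{u}^{i} + \frac{1}{bc}\,\ddot{u}^{i}$$

$$D^{i} = \Lambda u^{i} + M\dot{u}^{i} + N\ddot{u}^{i} + \Xi\,\dddot{u}^{i}.$$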
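Before attempting the algebra of Exercise 11.7.3 it can help to test the formulas numerically. The sketch below builds $R$ from (11.136) and $R^{-1}$ from (11.139) to (11.143) for arbitrary sample values of the curvatures and their derivatives, and checks that $R^{-1}R$ is the identity and that $\det R = b^{3}c^{2}d$. The function names, the argument names `db`, `ddb`, `dc` (standing for $\dot{b}$, $\ddot{b}$, $\dot{c}$) and the numerical values are hypothetical.

```python
def physical_basis_matrices(b, c, d, db, ddb, dc, e_a=-1, e_b=1, e_c=1):
    """Return (R, R_inv) of (11.136) and (11.139); db, ddb, dc are derivatives of b and c."""
    e, f = e_a * e_b, e_b * e_c
    r = [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, b, 0.0, 0.0],
        [-e * b**2, db, b * c, 0.0],
        [-3 * e * b * db, ddb - e * b**3 - f * b * c**2, b * dc + 2 * c * db, b * c * d],
    ]
    lam = -e * (dc * b - db * c) / (c**2 * d)
    m_num = e * c * b**4 + f * c**3 * b**2 - b * (ddb * c - db * dc) + 2 * c * db**2
    m = m_num / (c**2 * b**3 * d)
    n = (-dc * b - 2 * db * c) / (b**2 * c**2 * d)
    xi = 1.0 / (b * c * d)
    r_inv = [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0 / b, 0.0, 0.0],
        [e * b / c, -db / (b**2 * c), 1.0 / (b * c), 0.0],
        [lam, m, n, xi],
    ]
    return r, r_inv


def matmul(p, q):
    """Product of two 4 x 4 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


# Arbitrary sample values (hypothetical, chosen only to exercise the formulas).
b, c, d = 1.3, 0.7, 0.4
R, R_inv = physical_basis_matrices(b, c, d, db=0.2, ddb=-0.1, dc=0.5)
product = matmul(R_inv, R)
det_R = R[0][0] * R[1][1] * R[2][2] * R[3][3]  # R is lower triangular
print(product)
print(det_R, b**3 * c**2 * d)
```

Such a check does not replace the proof, but it catches sign errors in the entries $\Lambda$, $M$, $N$ quickly.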


Consider now an arbitrary four-vector which in the Frenet-Serret basis has the following decomposition:

$$r^{i} = \alpha A^{i} + \beta B^{i} + \gamma C^{i} + \delta D^{i}. \tag{11.144}$$




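Replacing $A^{i}$, $B^{i}$, $C^{i}$, $D^{i}$ in terms of the vectors of the physical basis we find:

$$\begin{aligned} r^{i} ={}& \left[\alpha + \gamma\frac{\epsilon(A)\epsilon(B)\,b}{c} - \delta\frac{\epsilon(A)\epsilon(B)(\dot{c}b - \dot{b}c)}{c^{2}d}\right]A^{i} \\ &+ \frac{1}{b}\left[\beta - \gamma\frac{\dot{b}}{bc} + \delta\frac{\epsilon(A)\epsilon(B)\,cb^{4} + \epsilon(B)\epsilon(C)\,c^{3}b^{2} - b(\ddot{b}c - \dot{b}\dot{c}) + 2c\dot{b}^{2}}{b^{2}c^{2}d}\right]\dot{A}^{i} \\ &+ \frac{1}{bc}\left[\gamma - \delta\frac{\dot{c}b + 2\dot{b}c}{bcd}\right]\ddot{A}^{i} + \frac{\delta}{bcd}\,\dddot{A}^{i}. \end{aligned} \tag{11.145}$$

Equation (11.145) is the answer to our problem, that is, it gives the generic form of an arbitrary four-vector in terms of the tangent vector to the curve and its derivatives. If we denote the curvatures $b$, $c$, $d$ as $\kappa_{1}$, $\kappa_{2}$, $\kappa_{3}$ then expression (11.145) reads:

$$\begin{aligned} r^{i} ={}& \left[\alpha + \gamma\frac{\epsilon(A)\epsilon(B)\,\kappa_{1}}{\kappa_{2}} - \delta\frac{\epsilon(A)\epsilon(B)(\dot{\kappa}_{2}\kappa_{1} - \dot{\kappa}_{1}\kappa_{2})}{\kappa_{2}^{2}\kappa_{3}}\right]A^{i} \\ &+ \frac{1}{\kappa_{1}}\left[\beta - \gamma\frac{\dot{\kappa}_{1}}{\kappa_{1}\kappa_{2}} + \delta\frac{\epsilon(A)\epsilon(B)\,\kappa_{2}\kappa_{1}^{4} + \epsilon(B)\epsilon(C)\,\kappa_{2}^{3}\kappa_{1}^{2} + \kappa_{1}(\dot{\kappa}_{2}\dot{\kappa}_{1} - \ddot{\kappa}_{1}\kappa_{2}) + 2\kappa_{2}\dot{\kappa}_{1}^{2}}{\kappa_{1}^{2}\kappa_{2}^{2}\kappa_{3}}\right]\dot{A}^{i} \\ &+ \left[\frac{\gamma}{\kappa_{1}\kappa_{2}} - \delta\frac{\dot{\kappa}_{2}\kappa_{1} + 2\dot{\kappa}_{1}\kappa_{2}}{\kappa_{1}^{2}\kappa_{2}^{2}\kappa_{3}}\right]\ddot{A}^{i} + \frac{\delta}{\kappa_{1}\kappa_{2}\kappa_{3}}\,\dddot{A}^{i}. \end{aligned} \tag{11.146}$$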
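We may write this expression as

$$r^{i} = Q \begin{pmatrix} A^{i} \\ \dot{A}^{i} \\ \ddot{A}^{i} \\ \dddot{A}^{i} \end{pmatrix} \tag{11.147}$$

where $Q = (Q_{1}, Q_{2}, Q_{3}, Q_{4})$ is the row matrix with entries

$$\begin{aligned} Q_{1} &= \alpha + \gamma\frac{\epsilon(A)\epsilon(B)\,\kappa_{1}}{\kappa_{2}} - \delta\frac{\epsilon(A)\epsilon(B)(\dot{\kappa}_{2}\kappa_{1} - \dot{\kappa}_{1}\kappa_{2})}{\kappa_{2}^{2}\kappa_{3}} \\ Q_{2} &= \frac{1}{\kappa_{1}}\left[\beta - \gamma\frac{\dot{\kappa}_{1}}{\kappa_{1}\kappa_{2}} + \delta\frac{\epsilon(A)\epsilon(B)\,\kappa_{2}\kappa_{1}^{4} + \epsilon(B)\epsilon(C)\,\kappa_{2}^{3}\kappa_{1}^{2} + \kappa_{1}(\dot{\kappa}_{2}\dot{\kappa}_{1} - \ddot{\kappa}_{1}\kappa_{2}) + 2\kappa_{2}\dot{\kappa}_{1}^{2}}{\kappa_{1}^{2}\kappa_{2}^{2}\kappa_{3}}\right] \\ Q_{3} &= \frac{\gamma}{\kappa_{1}\kappa_{2}} - \delta\frac{\dot{\kappa}_{2}\kappa_{1} + 2\dot{\kappa}_{1}\kappa_{2}}{\kappa_{1}^{2}\kappa_{2}^{2}\kappa_{3}} \\ Q_{4} &= \frac{\delta}{\kappa_{1}\kappa_{2}\kappa_{3}}. \end{aligned} \tag{11.148}$$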




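Exercise 11.7.4 Show that in the Frenet-Serret basis the generic vector $r^{i}$ is

$$r^{i} = QR \begin{pmatrix} A^{i} \\ B^{i} \\ C^{i} \\ D^{i} \end{pmatrix} \tag{11.149}$$

where the product matrix $QR$ is the row matrix with the following entries, $Q_{1}, \ldots, Q_{4}$ being the entries of $Q$:

$$\begin{aligned} (QR)_{1} &= Q_{1} - \epsilon(A)\epsilon(B)\,\kappa_{1}^{2}\,Q_{3} - 3\epsilon(A)\epsilon(B)\,\kappa_{1}\dot{\kappa}_{1}\,Q_{4} \\ (QR)_{2} &= \kappa_{1}Q_{2} + \dot{\kappa}_{1}Q_{3} + \left(\ddot{\kappa}_{1} - \epsilon(A)\epsilon(B)\,\kappa_{1}^{3} - \epsilon(B)\epsilon(C)\,\kappa_{1}\kappa_{2}^{2}\right)Q_{4} \\ (QR)_{3} &= \kappa_{1}\kappa_{2}\,Q_{3} + (\kappa_{1}\dot{\kappa}_{2} + 2\kappa_{2}\dot{\kappa}_{1})\,Q_{4} \\ (QR)_{4} &= \kappa_{1}\kappa_{2}\kappa_{3}\,Q_{4}. \end{aligned} \tag{11.150}$$

(Substituting the entries of $Q$, this must reduce to $(\alpha, \beta, \gamma, \delta)$, in agreement with (11.144).)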

11.7.2. The Generic Inertial Four-Force 


We are now ready to compute the generic expression of a pure four-force which modulates the motion of a ReMaP $P$. Let $c^{i}$ be the worldline of $P$, which we assume to be such that $\kappa_{1}\kappa_{2}\kappa_{3} \neq 0$, and let $u^{i}$ be the four-velocity of $P$. The physical frame along $c^{i}$ consists of the four-vectors:

$$A^{i} = u^{i}, \qquad \dot{A}^{i} = \dot{u}^{i}, \qquad \ddot{A}^{i} = \ddot{u}^{i}, \qquad \dddot{A}^{i} = \dddot{u}^{i}.$$


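A pure (or inertial) four-force $F^{i}$ on $P$ is defined by the condition $F^{i}u_{i} = 0$. The general expression (11.146) reads for $\epsilon(A) = -1$, $\epsilon(B) = \epsilon(C) = +1$:

$$\begin{aligned} r^{i} ={}& \left[\alpha - \gamma\frac{\kappa_{1}}{\kappa_{2}} + \delta\frac{\dot{\kappa}_{2}\kappa_{1} - \dot{\kappa}_{1}\kappa_{2}}{\kappa_{2}^{2}\kappa_{3}}\right]u^{i} \\ &+ \frac{1}{\kappa_{1}}\left[\beta - \gamma\frac{\dot{\kappa}_{1}}{\kappa_{1}\kappa_{2}} + \delta\frac{-\kappa_{2}\kappa_{1}^{4} + \kappa_{2}^{3}\kappa_{1}^{2} + \kappa_{1}(\dot{\kappa}_{2}\dot{\kappa}_{1} - \ddot{\kappa}_{1}\kappa_{2}) + 2\kappa_{2}\dot{\kappa}_{1}^{2}}{\kappa_{1}^{2}\kappa_{2}^{2}\kappa_{3}}\right]\dot{u}^{i} \\ &+ \left[\frac{\gamma}{\kappa_{1}\kappa_{2}} - \delta\frac{\dot{\kappa}_{2}\kappa_{1} + 2\dot{\kappa}_{1}\kappa_{2}}{\kappa_{1}^{2}\kappa_{2}^{2}\kappa_{3}}\right]\ddot{u}^{i} + \frac{\delta}{\kappa_{1}\kappa_{2}\kappa_{3}}\,\dddot{u}^{i}. \end{aligned} \tag{11.151}$$

Condition $r^{i}u_{i} = 0$ implies $\alpha = 0$ (due to (11.144)), therefore, the generic expression of a pure (or inertial) four-force acting on $P$ (and in general of a spacelike vector normal to the four-velocity $u^{i}$) is:

$$\begin{aligned} F^{i} ={}& \left[-\gamma\frac{\kappa_{1}}{\kappa_{2}} + \delta\frac{\dot{\kappa}_{2}\kappa_{1} - \dot{\kappa}_{1}\kappa_{2}}{\kappa_{2}^{2}\kappa_{3}}\right]u^{i} \\ &+ \frac{1}{\kappa_{1}}\left[\beta - \gamma\frac{\dot{\kappa}_{1}}{\kappa_{1}\kappa_{2}} + \delta\frac{-\kappa_{2}\kappa_{1}^{4} + \kappa_{2}^{3}\kappa_{1}^{2} + \kappa_{1}(\dot{\kappa}_{2}\dot{\kappa}_{1} - \ddot{\kappa}_{1}\kappa_{2}) + 2\kappa_{2}\dot{\kappa}_{1}^{2}}{\kappa_{1}^{2}\kappa_{2}^{2}\kappa_{3}}\right]\dot{u}^{i} \\ &+ \left[\frac{\gamma}{\kappa_{1}\kappa_{2}} - \delta\frac{\dot{\kappa}_{2}\kappa_{1} + 2\dot{\kappa}_{1}\kappa_{2}}{\kappa_{1}^{2}\kappa_{2}^{2}\kappa_{3}}\right]\ddot{u}^{i} + \frac{\delta}{\kappa_{1}\kappa_{2}\kappa_{3}}\,\dddot{u}^{i}. \end{aligned} \tag{11.152}$$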

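Note that this expression contains $u^{i}$ although $F^{i}u_{i} = 0$! This is due to the fact that the basis $\{u^{i}, \dot{u}^{i}, \ddot{u}^{i}, \dddot{u}^{i}\}$ is not orthonormal. For example:

$$u^{i}\ddot{u}_{i} = (u^{i}\dot{u}_{i})^{\cdot} - \dot{u}^{i}\dot{u}_{i} = -\dot{u}^{i}\dot{u}_{i} = -b^{2} \neq 0. \tag{11.153}$$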


Also note that, for $\delta = 0$, the third component equals $-\frac{1}{\kappa_{1}^{2}}\times$ first component. Therefore essentially we have three conditions for any specification of the four-force $r^{i}$.


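Exercise 11.7.5 Prove the relations:

$$\ddot{u}^{i}u_{i} = -\kappa_{1}^{2} \tag{11.154}$$

$$\ddot{u}^{i}\dot{u}_{i} = \kappa_{1}\dot{\kappa}_{1} \tag{11.155}$$

$$\dddot{u}^{i}u_{i} = -3\kappa_{1}\dot{\kappa}_{1}. \tag{11.156}$$

Hint: Use (11.136) to write $\ddot{u}^{i} = b^{2}A^{i} + \dot{b}B^{i} + bcC^{i}$.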


11.7.3 Applications 


During the many years of Special Relativity many types of pure four-forces have been proposed. Each of these satisfies a physical “need” and follows from some physical considerations or inspirations. All the proposed four-forces must follow from the generic expression (11.152) for an appropriate set of values of the parameters $\beta$, $\gamma$, $\delta$, otherwise they are excluded mathematically. In the following, we shall examine a few types of pure four-forces proposed over the years and will decide on their geometric acceptance or rejection assuming that $\kappa_{1}\kappa_{2}\kappa_{3} \neq 0$.


11.7.3.1. Newtonian Type Four-Force 


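This force has the general expression $F^{i} = m\dot{u}^{i}$ and is the generalization of Newton's Second Law in Special Relativity. To prove that this type of force is acceptable, we have to prove that it follows from the generic expression (11.152) for a special set of values of the parameters $\beta$, $\gamma$, $\delta$. For this, we examine if the following system of equations has a unique solution:

$$\begin{aligned} -\gamma\frac{\kappa_{1}}{\kappa_{2}} + \delta\frac{\dot{\kappa}_{2}\kappa_{1} - \dot{\kappa}_{1}\kappa_{2}}{\kappa_{2}^{2}\kappa_{3}} &= 0 \\ \frac{1}{\kappa_{1}}\left[\beta - \gamma\frac{\dot{\kappa}_{1}}{\kappa_{1}\kappa_{2}} + \delta\frac{-\kappa_{2}\kappa_{1}^{4} + \kappa_{2}^{3}\kappa_{1}^{2} + \kappa_{1}(\dot{\kappa}_{2}\dot{\kappa}_{1} - \ddot{\kappa}_{1}\kappa_{2}) + 2\kappa_{2}\dot{\kappa}_{1}^{2}}{\kappa_{1}^{2}\kappa_{2}^{2}\kappa_{3}}\right] &= m \\ \frac{\gamma}{\kappa_{1}\kappa_{2}} - \delta\frac{\dot{\kappa}_{2}\kappa_{1} + 2\dot{\kappa}_{1}\kappa_{2}}{\kappa_{1}^{2}\kappa_{2}^{2}\kappa_{3}} &= 0 \\ \delta &= 0. \end{aligned}$$

The solution of the system is $\gamma = \delta = 0$, $\beta = \kappa_{1}m$, where $\kappa_{1}$ is the first curvature (= the measure of the four-acceleration) of the worldline. We conclude that the proposed Newtonian four-force is geometrically acceptable. This does not mean of course that it is also physically acceptable. Only experiment can establish this.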


11.7.3.2 The Lorentz-Dirac Four-Force 
Dirac suggested that the four-force on a particle of mass $m$ and charge $q$, which accelerates under the action of a 3-force, is given by the following formula (Lorentz-Dirac force):

$$F^{i} = m\dot{u}^{i} - \frac{2}{3}q^{2}\left(\ddot{u}^{i} - (\dot{u}^{j}\dot{u}_{j})\,u^{i}\right). \tag{11.157}$$

The second term is assumed to describe the self interaction of the charge with its own radiation field.




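We examine whether this type of force is geometrically acceptable. For that we check
whether it is normal to the four-velocity. Using $u^iu_i = -1$ and $\dot{u}^iu_i = 0$ we find:

\[ F^iu_i = -\frac{2}{3}q^2\left(\ddot{u}^iu_i - (\dot{u}^j\dot{u}_j)\,u^iu_i\right) = -\frac{2}{3}q^2\left(\ddot{u}^iu_i + \dot{u}^i\dot{u}_i\right) = 0 \]

where the last equality follows by differentiating $\dot{u}^iu_i = 0$. Therefore, $F^i$ is a pure
four-force. We continue with the computation of the parameters $\beta, \gamma, \delta$. We note
that $\dot{u}^i\dot{u}_i = \kappa_1^2$, hence (11.157) can be written:

\[ F^i = m\dot{u}^i - \frac{2}{3}q^2\left(\ddot{u}^i - \kappa_1^2u^i\right). \qquad (11.158) \]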


Comparing (11.158) with the generic expression (11.152) we find the following
algebraic system of four equations in the three unknowns $\beta, \gamma, \delta$:


ye 4 2 + 2k1K2 _ 249 


=<-K 
K2 Kid grt? 
1 ral KK} + KSKP Ky (Kok — R1K2) + 22k? 
B vate 23 = 
Ky KyTC Kjord 
1 g kaki t 2k1K2 _ 2 2 
vrek Reed ae 
6 =0. 
The solution of this system is:

\[ \beta = m\kappa_1 - \frac{2}{3}\dot{\kappa}_1q^2, \qquad \gamma = -\frac{2}{3}\kappa_1\kappa_2q^2, \qquad \delta = 0. \qquad (11.159) \]


We conclude that the Lorentz-Dirac four-force is geometrically acceptable and,
furthermore, in the Frenet-Serret basis it can be written as:

\[ F^i = \left(m\kappa_1 - \frac{2}{3}\dot{\kappa}_1q^2\right)B^i - \frac{2}{3}\kappa_1\kappa_2q^2\,C^i. \qquad (11.160) \]
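The coefficients of the Lorentz-Dirac force in the Frenet-Serret basis can be cross-checked by expanding $\ddot{u}^i$ directly. The following is only a sketch: it assumes the first two Frenet-Serret equations in the form $\dot{u}^i = \kappa_1B^i$, $\dot{B}^i = \kappa_1u^i + \kappa_2C^i$ with $u^iu_i = -1$ and $B^iB_i = 1$; this form and its sign conventions are an assumption made here, not a quotation of the text.

```latex
\begin{align*}
\ddot{u}^i &= \frac{d}{d\tau}\left(\kappa_1 B^i\right)
            = \dot{\kappa}_1 B^i + \kappa_1\left(\kappa_1 u^i + \kappa_2 C^i\right)
            = \kappa_1^2 u^i + \dot{\kappa}_1 B^i + \kappa_1\kappa_2 C^i, \\
\dot{u}^j\dot{u}_j &= \kappa_1^2 B^j B_j = \kappa_1^2, \\
F^i &= m\kappa_1 B^i - \tfrac{2}{3}q^2\left(\ddot{u}^i - \kappa_1^2 u^i\right)
     = \left(m\kappa_1 - \tfrac{2}{3}\dot{\kappa}_1 q^2\right) B^i
       - \tfrac{2}{3}\kappa_1\kappa_2 q^2\, C^i .
\end{align*}
```

Under these assumptions the component along $u^i$ cancels, so the force is pure, and no component along the third normal appears, which agrees with $\delta = 0$.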


Note that the case $F^i = 0$ is impossible, because then:

\[ m\kappa_1 = 0, \quad \kappa_1\kappa_2q^2 = 0 \;\Longrightarrow\; m = q = 0 \quad \text{because } \kappa_1\kappa_2 \neq 0. \qquad (11.161) \]




11.7.3.3. Bonnor Four-Force 


Bonnor”* assumed that the four-force on an accelerating charged particle depends 
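(i) on $u^i$, $\dot{u}^i$ and (ii) the amount of radiated energy is measured by the change $\dot{m}$
of the mass of the particle. Then he gave the expression:

\[ F^i = m\dot{u}^i + \left(\dot{m} - \frac{2}{3}q^2(\dot{u}^j\dot{u}_j)\right)u^i. \qquad (11.162) \]

As we have seen, $\dot{u}^j\dot{u}_j = \kappa_1^2$, hence this can be written:

\[ F^i = m\dot{u}^i + \left(\dot{m} - \frac{2}{3}\kappa_1^2q^2\right)u^i. \qquad (11.163) \]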


We demand the condition $F^iu_i = 0$ (pure four-force) and find from (11.163):

\[ F^iu_i = -\left(\dot{m} - \frac{2}{3}\kappa_1^2q^2\right) = 0 \;\Longleftrightarrow\; \dot{m} = \frac{2}{3}\kappa_1^2q^2. \qquad (11.164) \]

Replacing this back in (11.162) we find $F^i = m\dot{u}^i$, that is, the Newtonian type of
force but with varying mass. We do not expect that the Newtonian type of force
can describe the self-force on the charge; therefore, it appears that the
Bonnor four-force does not serve its purpose. Indeed, it was soon abandoned.

The conclusion is: 


When Geometry is used properly it becomes a great invaluable tool in making good Physics! 


2See Bonnor WB (1974) “A new equation of motion for a radiating charged particle” Proc R Soc 
Lond A, 337, 591-598. Also Schild A (1960) “On the radiation emitted by an accelerated point 
charge” Jour Math Analysis and Applications 1, 127-131. 


Chapter 12
Irreducible Decompositions


12.1 Decompositions 


In order to understand the concept of decomposition it is best to start with a well
known example. Consider a two index tensor $T_{ab}$ and write it in terms of the
symmetric and antisymmetric parts as follows:

\[ T_{ab} = \frac{1}{2}\left(T_{ab} + T_{ba}\right) + \frac{1}{2}\left(T_{ab} - T_{ba}\right) = T_{(ab)} + T_{[ab]}. \qquad (12.1) \]


This breaking (= decomposition) of the tensor in two parts has the advantage that
under a change of coordinates each part transforms independently of the other.
Hence, one can study the behavior of the tensor under coordinate transformations
by studying the behavior of each part independently, thus facilitating the
study. Since equation (12.1) is an identity, no information is lost. Behind this obvious
remark there is a general conclusion concerning the behavior of a tensor under
coordinate transformations, which is as follows. The coordinate transformations
of a space (i.e. all differentiable coordinate transformations) form a group known as
the Manifold Mapping Group (MMG). As we have said, a group of transformations
defines a type of tensors, i.e. the geometric objects which transform covariantly under
this group. The tensors defined by the MMG are the most general ones defined on a
space, because the MMG requires no other structure on the space but the manifold
structure itself, which is inherent in the definition of the space. The decomposition
of the tensor $T_{ab}$ we considered in symmetric and antisymmetric parts is covariant
wrt the MMG in the sense that (a) each part is a tensor of the same type as
$T_{ab}$ and (b) each part transforms independently of the other under the action of the
MMG (i.e. under coordinate transformations).

What has been said for the MMG applies to any other group of transformations. 
In particular, let us consider the Lorentz group which is the fundamental group of 
Special Relativity. The tensors defined by this group are the Lorentz tensors. A 


© Springer Nature Switzerland AG 2019 401 
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-27347-7_12 




covariant decomposition of a Lorentz tensor in irreducible parts is the decompo- 
sition (via an identity) of this tensor in a sum of Lorentz tensors each with fewer 
components, which transform independently of each other under the action of the 
Lorentz group. In the present chapter, we discuss the irreducible decomposition 
of vectors and tensors wrt a general vector and wrt a pair of vectors. We apply 
the results in Minkowski space and produce the well known and important 1+3 
and 1+1+2 decompositions. These decompositions are usually hidden behind the 
discussions in Special Relativity, however they have to be considered explicitly 
when one considers the energy momentum tensor, the relativistic fluids and other 
relevant material. 


12.1.1 Writing a Tensor of Valence (0,2) as a Matrix 


The calculations involving tensors with two indices are simplified significantly if 
we write them as square matrices. In order to do that, we must follow a definite 
convention which will secure the validity of the results. For the vectors in a linear
space $V^3$ (we use $V^3$ for economy of space; the same applies to any finite $n$) we have
made the convention to represent the contravariant vectors with $n \times 1$ (or column)
matrices and the covariant vectors with $1 \times n$ (or row) matrices. For the tensors with
two covariant indices we make the following convention:


The first index counts rows and the second counts columns. 


According to this convention we write: 


\[ T_{ij} = \begin{pmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \\ T_{31} & T_{32} & T_{33} \end{pmatrix} \]


A special case of tensors with two indices are the products of two vectors $A^\mu B^\nu$.
According to the definition of tensor product, the components of the tensor $A^\mu B^\nu$
(in the coordinate system in which the components of the vectors $A^\mu$ and $B^\nu$ are
given) are found by multiplying each component of $A^\mu$ with all the components
of $B^\nu$ and setting the result as a row in the resulting matrix. For example, if we are
given:


then the tensor product: 




It is easy to show that the matrix which is defined by the product $A^\mu \otimes B^\nu$ is the
transpose of the matrix corresponding to the product $B^\mu \otimes A^\nu$.

Another case which we meet frequently in practice is the computation of the
components of the tensor $T_{ij}A^j$ if we are given the components of the tensors
$T_{ij}$, $A^i$ (in the same coordinate system!). According to the convention above, this is
found as follows:


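\[ T_{ij}A^j = \begin{pmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \\ T_{31} & T_{32} & T_{33} \end{pmatrix} \begin{pmatrix} A^1 \\ A^2 \\ A^3 \end{pmatrix} = \begin{pmatrix} T_{11}A^1 + T_{12}A^2 + T_{13}A^3 \\ T_{21}A^1 + T_{22}A^2 + T_{23}A^3 \\ T_{31}A^1 + T_{32}A^2 + T_{33}A^3 \end{pmatrix} \]

and:

\[ T_{ij}A^i = \begin{pmatrix} A^1 \\ A^2 \\ A^3 \end{pmatrix}^t \begin{pmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \\ T_{31} & T_{32} & T_{33} \end{pmatrix} = (A^1, A^2, A^3) \begin{pmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \\ T_{31} & T_{32} & T_{33} \end{pmatrix} \]

\[ = \left( A^1T_{11} + A^2T_{21} + A^3T_{31},\; A^1T_{12} + A^2T_{22} + A^3T_{32},\; A^1T_{13} + A^2T_{23} + A^3T_{33} \right). \]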

In order to use a more compact notation we represent the tensor $T_{ij}$ with the
matrix $[T]$ and the vector $A^i$ with the matrix $[A]$, and compute the tensor $T_{ij}A^j$ from
the product of matrices $[T][A]$ and the tensor $T_{ij}A^i$ from the product of matrices
$[A]^t[T]$.

In the following, we shall apply extensively the above conventions. However,
before one accepts a result as final, it is advisable to check it at least
partially using the standard analytical method of components. Of course the best is
to do the calculations by means of an algebraic computer program.
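As a small illustration of the matrix conventions above, the following NumPy sketch forms a tensor product and the two contractions $T_{ij}A^j$ and $T_{ij}A^i$. The numerical components of `A`, `B` and `T` are hypothetical sample values chosen only for the illustration.

```python
import numpy as np

# Hypothetical sample components (chosen for illustration only).
A = np.array([1.0, 2.0, 0.0])
B = np.array([3.0, -1.0, 2.0])
T = np.array([[1.0, 0.0, 1.0],
              [1.0, 2.0, 1.0],
              [0.0, 0.0, 1.0]])

# Tensor product: the first index counts rows, the second counts columns,
# so entry [mu, nu] of AB is A^mu * B^nu.
AB = np.outer(A, B)
BA = np.outer(B, A)

# T_ij A^j: contraction over the second index, i.e. the product [T][A].
T_contract_second = T @ A

# T_ij A^i: contraction over the first index, i.e. the product [A]^t [T].
T_contract_first = A @ T
```

With this convention the matrix of $A \otimes B$ is the transpose of the matrix of $B \otimes A$, and contracting the first or the second index of $T_{ij}$ generally gives different vectors.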


12.2 The Irreducible Decomposition wrt a Non-null Vector 


With every non-null vector we associate a unique projection operator which projects
onto the plane normal to the vector. This operator can be defined in any metrical space
(positive definite or not) and especially in the Euclidean space. In this section, we
shall study this projective tensor and we shall define a covariant decomposition of
any tensor in irreducible parts. We call this decomposition the $1+(n-1)$ decomposition;
it is used extensively in relativistic Physics (however not in Newtonian Physics).




In order to emphasize the concept and the generality of the projection operator 
in the following we shall consider initially a Euclidean space of dimension n(> 2) 
and subsequently we shall consider the 1+3 decomposition in Minkowski space. 


12.2.1 Decomposition in a Euclidean Space $E^n$

12.2.1.1 Decomposition of a Euclidean Vector


Consider a Euclidean space of dimension n(> 2) endowed with a metric $g_{E\mu\nu}$. In
an ECF the components of the metric are $\delta_{\mu\nu}$. Let $A^\mu$ be a vector in space whose
length $A^2 = g_{E\mu\nu}A^\mu A^\nu > 0$. We define the projective tensor $h_{\mu\nu}$ associated with
the vector $A^\mu$ as follows:

\[ h_{\mu\nu}(A) = g_{E\mu\nu} - \frac{1}{A^2}A_\mu A_\nu. \qquad (12.2) \]

It is easy to prove the following identities:

\[ h_{\mu\nu}(A) = h_{\nu\mu}(A) \quad \text{(symmetric tensor)} \qquad (12.3) \]
\[ h_{\mu\nu}(A)\,\delta^{\mu\nu} = h^\mu{}_\mu(A) = n - 1 \quad \text{(trace)} \qquad (12.4) \]
\[ h_{\mu\nu}(A)A^\nu = 0 \quad \text{(projects normal to the vector } A^\mu\text{)} \qquad (12.5) \]


Exercise 12.2.1 Show that the components of the projection tensor h,,(A) in a 
ECF II and for an arbitrary vector A" are diag(1 — (Al), wy le 7 (A")*). 
Furthermore show that if $A^\mu$ is unit (i.e. $A^2 = 1$) then $h_{\mu\nu}(A) = \delta_{\mu\nu} - A_\mu A_\nu$.


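Using $h_{\mu\nu}(A)$ we can decompose any other vector $B^\mu$ along and normal to the
vector $A^\mu$ as follows:

\[ B^\mu = \delta^\mu_\nu B^\nu = \left(h^\mu{}_\nu + \frac{1}{A^2}A^\mu A_\nu\right)B^\nu = h^\mu{}_\nu B^\nu + \frac{1}{A^2}\left(A_\nu B^\nu\right)A^\mu. \qquad (12.6) \]

We call the vector $B^\mu_\perp = h^\mu{}_\nu B^\nu$ the normal component of $B^\mu$ wrt $A^\mu$ and the vector
$B^\mu_\parallel = \frac{1}{A^2}\left(A_\nu B^\nu\right)A^\mu$ the parallel component of $B^\mu$ wrt $A^\mu$. In the following, we assume
that we work in an ECF; therefore, the components of the metric are $\delta_{\mu\nu}$.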


Example 12.2.1 Decompose the vector B’ = | 1 | normal and parallel to the 


1 
vector A“ = | 2 
0 




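Solution
We compute $|A|^2 = 5$. Hence:

\[ h(A)_{\mu\nu} = \delta_{\mu\nu} - \frac{1}{A^2}A_\mu A_\nu = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} - \frac{1}{5}\begin{pmatrix} 1 & 2 & 0 \\ 2 & 4 & 0 \\ 0 & 0 & 0 \end{pmatrix} = \frac{1}{5}\begin{pmatrix} 4 & -2 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 5 \end{pmatrix}. \]

(Check that $h(A)_{\mu\nu}A^\nu = 0$!). Then for the normal and the parallel parts of $B^\mu$ we
have: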


1 7 1 3 
BY =h(A)yB" = 5 = , Bi = BK Bi == : 


(Check that $B^\mu = B^\mu_\perp + B^\mu_\parallel$).
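The projection tensor and the splitting (12.6) are easy to check numerically. The sketch below assumes an ECF (metric $\delta_{\mu\nu}$) and uses the vector $A^\mu = (1, 2, 0)$ of the example; the vector `B` and the helper name `projector` are hypothetical, introduced only for the illustration.

```python
import numpy as np

def projector(A, metric):
    """Return h_{mu nu}(A) = g_{mu nu} - A_mu A_nu / A^2 for a non-null vector A."""
    A_low = metric @ A            # lower the index: A_mu = g_{mu nu} A^nu
    A2 = A @ A_low                # squared length A^2 = g_{mu nu} A^mu A^nu
    return metric - np.outer(A_low, A_low) / A2

delta = np.eye(3)                     # Euclidean metric in an ECF
A = np.array([1.0, 2.0, 0.0])         # the vector A of the example
B = np.array([1.0, 1.0, 1.0])         # hypothetical sample vector to decompose

h = projector(A, delta)
B_perp = h @ B                        # normal component h^mu_nu B^nu
B_par = (A @ B) / (A @ A) * A         # parallel component (A_nu B^nu / A^2) A^mu
```

In a Euclidean ECF raising and lowering indices changes nothing, which is why `h` can be applied directly to the contravariant components of `B`.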


12.2.1.2 Decomposition of a Euclidean Second Order Tensor 


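The $1+(n-1)$ decomposition of an arbitrary tensor $T_{\mu\nu}$ in a Euclidean space of dimension
$n$ wrt the vector $A^\mu$ of (Euclidean) length $A^2$ is done by means of the following
identity:

\[ T_{\mu\nu} = \delta^\alpha_\mu\delta^\beta_\nu T_{\alpha\beta} = \left(h^\alpha{}_\mu + \frac{1}{A^2}A^\alpha A_\mu\right)\left(h^\beta{}_\nu + \frac{1}{A^2}A^\beta A_\nu\right)T_{\alpha\beta} \]

\[ = \frac{1}{A^4}T_{\alpha\beta}A^\alpha A^\beta A_\mu A_\nu + \frac{1}{A^2}h^\alpha{}_\mu A^\beta A_\nu T_{\alpha\beta} + \frac{1}{A^2}h^\beta{}_\nu A^\alpha A_\mu T_{\alpha\beta} + h^\alpha{}_\mu h^\beta{}_\nu T_{\alpha\beta} \]

\[ = \frac{1}{A^4}\left(T_{\alpha\beta}A^\alpha A^\beta\right)A_\mu A_\nu + \frac{1}{A^2}h^\alpha{}_\mu A^\beta T_{\alpha\beta}A_\nu + \frac{1}{A^2}h^\beta{}_\nu A^\alpha T_{\alpha\beta}A_\mu + h^\alpha{}_\mu h^\beta{}_\nu T_{\alpha\beta}. \qquad (12.7) \]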
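We note that the irreducible parts of the tensor $T_{\mu\nu}$ are the following tensors:

• An invariant: $\frac{1}{A^4}T_{\alpha\beta}A^\alpha A^\beta$
• A tensor with two indices but with components only in the space which is normal
to the vector $A^\mu$: $h^\alpha{}_\mu h^\beta{}_\nu T_{\alpha\beta}$
• Two vectors normal to $A^\mu$: $\frac{1}{A^2}h^\alpha{}_\mu A^\beta T_{\alpha\beta}$ and $\frac{1}{A^2}h^\beta{}_\nu A^\alpha T_{\alpha\beta}$

The above result we write conveniently in the form of the following block matrix:

\[ \begin{pmatrix} \frac{1}{A^4}T_{\alpha\beta}A^\alpha A^\beta & \frac{1}{A^2}h^\alpha{}_\mu A^\beta T_{\alpha\beta} \\[4pt] \frac{1}{A^2}h^\beta{}_\nu A^\alpha T_{\alpha\beta} & h^\alpha{}_\mu h^\beta{}_\nu T_{\alpha\beta} \end{pmatrix} \qquad (12.8) \]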




where the blocks (1,2) and (2,1) are matrices of order $n \times 1$ and $1 \times n$ respectively
and the block (2,2) is a square matrix of dimension $(n-1) \times (n-1)$. The $1+(n-1)$
decomposition (12.8) is used extensively in the study of the energy momentum
tensor and the kinematics of Special Relativity. Strangely enough, it appears that it is
not used in Newtonian Physics!


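Example 12.2.2 Decompose the tensor of order (0,2)

\[ T_{\mu\nu} = \begin{pmatrix} 1 & 0 & 1 \\ 1 & 2 & 1 \\ 0 & 0 & 1 \end{pmatrix} \]

wrt the Euclidean vector $A^\mu = (1, 2, 0)^t$.

Solution
In Example 12.2.1 we computed the projection operator:

\[ h(A)_{\mu\nu} = \frac{1}{5}\begin{pmatrix} 4 & -2 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 5 \end{pmatrix}. \]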


We are ready to compute the irreducible parts of the decomposition as given
in (12.7). Since the purpose of the example is to present practices of calculation
we shall follow the intermediate steps and show how they are computed using
matrices. We denote by $[h]$, $[T]$, $[A]$ the matrices which correspond to the tensors
$h_{\alpha\beta}$, $T_{\alpha\beta}$, $A^\alpha$ and give the answer as a product of matrices. We emphasize that
the calculation is done in a Euclidean space. In a Lorentz space, we have to take
into account the signs, which can be positive or negative. We also note that the
components of the metric are $\delta_{\mu\nu}$, therefore when we raise or lower indices we
observe no change in the components. For example, we have $[h_{\alpha\beta}] = [h^\alpha{}_\beta] = [h^{\alpha\beta}]$.
This is not the case with the Lorentz metric.

For the invariant part we have:

\[ \frac{1}{A^4}\left(T_{\alpha\beta}A^\alpha A^\beta\right) = \frac{1}{25}[A]^t[T][A] = \frac{11}{25}. \]




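The second irreducible part is a $3 \times 1$ matrix. We recall that the lower indices (covariant)
count rows and the upper indices (contravariant) count columns. We have:

\[ \frac{1}{A^2}h^\alpha{}_\mu A^\beta T_{\alpha\beta} = \frac{1}{A^2}\left(h^1{}_\mu A^\beta T_{1\beta} + h^2{}_\mu A^\beta T_{2\beta} + h^3{}_\mu A^\beta T_{3\beta}\right) = \frac{1}{A^2}\left(h_{\mu 1}\; h_{\mu 2}\; h_{\mu 3}\right)\begin{pmatrix} A^\beta T_{1\beta} \\ A^\beta T_{2\beta} \\ A^\beta T_{3\beta} \end{pmatrix} = \frac{1}{A^2}[h][T][A] \]

\[ = \frac{1}{25}\begin{pmatrix} 4 & -2 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 5 \end{pmatrix}\begin{pmatrix} 1 & 0 & 1 \\ 1 & 2 & 1 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix} = \frac{1}{25}\begin{pmatrix} -6 \\ 3 \\ 0 \end{pmatrix}. \]

Similarly, for the $1 \times 3$ irreducible part we have:

\[ \frac{1}{A^2}h^\beta{}_\nu A^\alpha T_{\alpha\beta} = \frac{1}{A^2}\left(h^1{}_\nu A^\alpha T_{\alpha 1} + h^2{}_\nu A^\alpha T_{\alpha 2} + h^3{}_\nu A^\alpha T_{\alpha 3}\right) = \frac{1}{A^2}\left(A^\alpha T_{\alpha 1}\; A^\alpha T_{\alpha 2}\; A^\alpha T_{\alpha 3}\right)\begin{pmatrix} h^1{}_\nu \\ h^2{}_\nu \\ h^3{}_\nu \end{pmatrix} = \frac{1}{A^2}[A]^t[T][h] \]

\[ = \frac{1}{25}\,(1, 2, 0)\begin{pmatrix} 1 & 0 & 1 \\ 1 & 2 & 1 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 4 & -2 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 5 \end{pmatrix} = \frac{1}{25}\,(4, -2, 15). \]

Finally, for the $3 \times 3$ irreducible part we compute:

\[ h^\alpha{}_\mu h^\beta{}_\nu T_{\alpha\beta} = [h][T][h] = \frac{1}{25}\begin{pmatrix} 4 & -2 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 5 \end{pmatrix}\begin{pmatrix} 1 & 0 & 1 \\ 1 & 2 & 1 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 4 & -2 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 5 \end{pmatrix} = \frac{1}{25}\begin{pmatrix} 16 & -8 & 10 \\ -8 & 4 & -5 \\ 0 & 0 & 25 \end{pmatrix}. \]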

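It would be a good exercise to check that the above results are correct. To do this
one must add the computed irreducible parts and get the original tensor. Let us do
this. We have$^1$:

\[ A_\mu A_\nu = (1, 2, 0) \otimes (1, 2, 0) = \begin{pmatrix} 1 & 2 & 0 \\ 2 & 4 & 0 \\ 0 & 0 & 0 \end{pmatrix} \]

\[ \left(\frac{1}{A^2}h^\alpha{}_\mu A^\beta T_{\alpha\beta}\right)A_\nu = \frac{1}{25}\,(-6, 3, 0) \otimes (1, 2, 0) = \frac{1}{25}\begin{pmatrix} -6 & -12 & 0 \\ 3 & 6 & 0 \\ 0 & 0 & 0 \end{pmatrix} \]

\[ \left(\frac{1}{A^2}h^\beta{}_\nu A^\alpha T_{\alpha\beta}\right)A_\mu = \frac{1}{25}\,(1, 2, 0) \otimes (4, -2, 15) = \frac{1}{25}\begin{pmatrix} 4 & -2 & 15 \\ 8 & -4 & 30 \\ 0 & 0 & 0 \end{pmatrix}. \]

It is easy to show that the sum of these three matrices (the first one multiplied by the
invariant $\frac{11}{25}$) and the matrix

\[ h^\alpha{}_\mu h^\beta{}_\nu T_{\alpha\beta} = \frac{1}{25}\begin{pmatrix} 16 & -8 & 10 \\ -8 & 4 & -5 \\ 0 & 0 & 25 \end{pmatrix} \]

gives indeed the original tensor $T_{\mu\nu}$.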
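The whole computation of this example can be repeated in a few lines of NumPy, which is a convenient way to guard against sign and transposition slips. This is a sketch under the conventions stated above (first index rows, second index columns, Euclidean ECF); the variable names are chosen here for illustration.

```python
import numpy as np

A = np.array([1.0, 2.0, 0.0])
T = np.array([[1.0, 0.0, 1.0],
              [1.0, 2.0, 1.0],
              [0.0, 0.0, 1.0]])

A2 = A @ A                                  # A^2 = 5
h = np.eye(3) - np.outer(A, A) / A2         # Euclidean ECF: upper = lower indices

invariant = (A @ T @ A) / A2**2             # (1/A^4) T_ab A^a A^b
column_part = h @ T @ A / A2                # (1/A^2) h^a_mu A^b T_ab
row_part = A @ T @ h / A2                   # (1/A^2) h^b_nu A^a T_ab
normal_part = h @ T @ h                     # h^a_mu h^b_nu T_ab

# Re-assemble the tensor from its irreducible parts, as in the identity (12.7).
T_rebuilt = (invariant * np.outer(A, A)
             + np.outer(column_part, A)
             + np.outer(A, row_part)
             + normal_part)
```

Comparing `T_rebuilt` with `T` (for example with `np.allclose`) confirms that no information is lost in the decomposition.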


12.2.2. 1+3 Decomposition in Minkowski Space 


The main difference between the decomposition of a tensor in Minkowski space and
in Euclidean space is that in the former, when we change an index from covariant
(lower index) to contravariant (upper index) in the computations, we have to change
the sign of the zeroth component. Furthermore, in Minkowski space we have vectors
with zero length, for which the projection operator is not defined. Finally, for
timelike vectors $A^iA_i < 0$; therefore we have to replace $A^iA_i$ with $\mathrm{sign}(A)A^2$,
where $A^2 > 0$ ($A \in R$) and $\mathrm{sign}(A) = -1, +1$ for timelike and spacelike
four-vectors respectively.

The first application of the 1+3 decomposition in Minkowski space is in
kinematics. For this reason, we consider the 1+3 decomposition wrt the
four-velocity, that is, a unit timelike vector$^2$ $u^a$ ($u^au_a = -1$). In this case, relations (12.6)


1Note that we have lowered the index in the term $\frac{1}{A^2}h^\alpha{}_\mu A^\beta T_{\alpha\beta}$, therefore we take the transpose
of the column matrix, i.e. the row $\frac{1}{25}(-6, 3, 0)$.

2We consider $c = 1$. Otherwise we have $u^au_a = -c^2$.




and (12.7) which we computed for the Euclidean metric apply, and taking into account that
$\mathrm{sign}(u) < 0$ we write$^3$:

• Projection operator:

\[ h(u)_{ab} = \eta_{ab} + u_au_b \qquad (12.9) \]

• 1+3 decomposition of a four-vector:

\[ w^a = -\left(w_bu^b\right)u^a + h(u)^a{}_b\,w^b. \qquad (12.10) \]

• Decomposition of a tensor of type (0, 2):

\[ T_{ab} = \left(T_{cd}u^cu^d\right)u_au_b - h(u)_a{}^c\,u^dT_{cd}\,u_b - h(u)_b{}^d\,u^cT_{cd}\,u_a + h(u)_a{}^c\,h(u)_b{}^d\,T_{cd}. \qquad (12.11) \]

We note that these formulae coincide with those of the last section when we
replace $A^2$ with $-1$. However the calculation of the components is another story.
This will become clear from the following example, which we advise the reader to
follow through step by step.


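Example 12.2.3 1+3 decompose the four-vector

\[ w^a = \begin{pmatrix} \sqrt{3} \\ 2 \\ 1 \\ 0 \end{pmatrix} \]

and the Lorentz tensor:

\[ T_{ab} = \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]

wrt the four-vector $u^a = (\sqrt{3}, 1, 1, 0)^t$. It is assumed that the components of all tensors
are in the same LCF system.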


3Relations (12.10) and (12.11) can be proved directly by using the identity $T_{ab} = \eta_a{}^c\eta_b{}^dT_{cd}$ and
then replacing $\eta_{ab} = h(u)_{ab} - u_au_b$, which follows from (12.9).




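Solution
Since we work in a LCF system the Lorentz metric has components $\eta_{ab} = \mathrm{diag}(-1, 1, 1, 1)$.
We compute$^4$:

\[ u_a \otimes u_b = (-\sqrt{3}, 1, 1, 0) \otimes (-\sqrt{3}, 1, 1, 0) = \begin{pmatrix} 3 & -\sqrt{3} & -\sqrt{3} & 0 \\ -\sqrt{3} & 1 & 1 & 0 \\ -\sqrt{3} & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \]

\[ u_au^a = (-\sqrt{3}, 1, 1, 0)\begin{pmatrix} \sqrt{3} \\ 1 \\ 1 \\ 0 \end{pmatrix} = -1 \quad \text{(unit timelike four-vector)}. \]

\[ h(u)_{ab} = \eta_{ab} + u_au_b = \mathrm{diag}(-1, 1, 1, 1) + \begin{pmatrix} 3 & -\sqrt{3} & -\sqrt{3} & 0 \\ -\sqrt{3} & 1 & 1 & 0 \\ -\sqrt{3} & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 2 & -\sqrt{3} & -\sqrt{3} & 0 \\ -\sqrt{3} & 2 & 1 & 0 \\ -\sqrt{3} & 1 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \]

Concerning the raising and the lowering of the indices we have:

\[ h(u)_a{}^b = \eta^{bc}h(u)_{ac} = \begin{pmatrix} -2 & -\sqrt{3} & -\sqrt{3} & 0 \\ \sqrt{3} & 2 & 1 & 0 \\ \sqrt{3} & 1 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]

and

\[ h(u)^a{}_b = \eta^{ac}h(u)_{cb} = \begin{pmatrix} -2 & \sqrt{3} & \sqrt{3} & 0 \\ -\sqrt{3} & 2 & 1 & 0 \\ -\sqrt{3} & 1 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \]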


4Recall: First index row, second index column!
Also $u_a = (-\sqrt{3}, 1, 1, 0)$!




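Note that $[h(u)_a{}^b] = [h(u)^a{}_b]^t$. We are ready to apply the 1+3 decomposition.
For the four-vector $w^a$ we have:

\[ w^a_\parallel: \quad u_aw^a = (-\sqrt{3}, 1, 1, 0)\begin{pmatrix} \sqrt{3} \\ 2 \\ 1 \\ 0 \end{pmatrix} = 0 \]

\[ w^a_\perp = h(u)^a{}_b\,w^b = \begin{pmatrix} -2 & \sqrt{3} & \sqrt{3} & 0 \\ -\sqrt{3} & 2 & 1 & 0 \\ -\sqrt{3} & 1 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} \sqrt{3} \\ 2 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} \sqrt{3} \\ 2 \\ 1 \\ 0 \end{pmatrix}. \]

We verify that $w^a = w^a_\parallel + w^a_\perp = (\sqrt{3}, 2, 1, 0)^t$.

Similarly, for the tensor $T_{ab}$ we compute:

\[ T_{ab}u^au^b = 7 \]

\[ h(u)_a{}^c\,T_{cd}u^d = [h(u)_a{}^c][T_{cd}][u^d] = \begin{pmatrix} -2 & -\sqrt{3} & -\sqrt{3} & 0 \\ \sqrt{3} & 2 & 1 & 0 \\ \sqrt{3} & 1 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} \sqrt{3} \\ 1 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} -6\sqrt{3} \\ 8 \\ 10 \\ 0 \end{pmatrix} \]

\[ h(u)_b{}^d\,T_{cd}u^c = [u^c]^t[T_{cd}][h(u)^d{}_b] = (\sqrt{3}, 1, 1, 0)\begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} -2 & \sqrt{3} & \sqrt{3} & 0 \\ -\sqrt{3} & 2 & 1 & 0 \\ -\sqrt{3} & 1 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \left(-6\sqrt{3},\, 9,\, 9,\, \sqrt{3} + 2\right) \]

\[ h(u)_a{}^c\,h(u)_b{}^d\,T_{cd} = [h(u)_a{}^c][T_{cd}][h(u)^d{}_b] = \begin{pmatrix} 16 & -8\sqrt{3} & -8\sqrt{3} & -2\sqrt{3} - 2 \\ -7\sqrt{3} & 11 & 10 & \sqrt{3} + 3 \\ -9\sqrt{3} & 13 & 14 & \sqrt{3} + 3 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \]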

It is left as an exercise for the reader to verify that the above decomposition is 
correct. 
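The verification left to the reader can also be done numerically. The sketch below assumes the data of the example in an LCF with $\eta_{ab} = \mathrm{diag}(-1,1,1,1)$, namely $u^a = (\sqrt{3},1,1,0)$, $w^a = (\sqrt{3},2,1,0)$ and the tensor $T_{ab}$ used in the matrix products above; the variable names are illustrative.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])        # Lorentz metric in an LCF (eta^{-1} = eta)
s3 = np.sqrt(3.0)
u_up = np.array([s3, 1.0, 1.0, 0.0])        # u^a
w_up = np.array([s3, 2.0, 1.0, 0.0])        # w^a
T = np.array([[1.0, 0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 1.0, 2.0, 1.0],
              [0.0, 0.0, 0.0, 1.0]])        # T_ab

u_low = eta @ u_up                          # u_a
h_low = eta + np.outer(u_low, u_low)        # h_ab = eta_ab + u_a u_b
h_down_up = h_low @ eta                     # h_a^c  (row a, column c)
h_up_down = eta @ h_low                     # h^a_b  (row a, column b)

# 1+3 decomposition of the four-vector, as in (12.10)
w_par = -(u_low @ w_up) * u_up
w_perp = h_up_down @ w_up

# 1+3 decomposition of the tensor, as in (12.11)
scalar = u_up @ T @ u_up                    # T_cd u^c u^d
X = h_down_up @ T @ u_up                    # h_a^c T_cd u^d
Y = u_up @ T @ h_up_down                    # h_b^d T_cd u^c
Z = h_down_up @ T @ h_up_down               # h_a^c h_b^d T_cd

T_rebuilt = (scalar * np.outer(u_low, u_low)
             - np.outer(X, u_low)
             - np.outer(u_low, Y)
             + Z)
```

The only difference from the Euclidean calculation is the explicit bookkeeping of index positions through `eta`, which is exactly where sign errors tend to appear in hand computations.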


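In case the four-vector $A^a$ is not unit, the projection tensor becomes:

\[ h_{ab}(A) = \eta_{ab} - \frac{\varepsilon(A)}{A^2}A_aA_b \qquad (A > 0) \qquad (12.12) \]

where $\varepsilon(A) = \mathrm{sign}(A)$, and the formulae which give the 1+3 decomposition change accordingly.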


For example, the 1+3 decomposition of a four-vector $B^a$ (null, timelike or
spacelike) wrt the vector $A^a$ is:

\[ B^a = \delta^a_bB^b = \left(h^a{}_b(A) + \frac{\varepsilon(A)}{A^2}A^aA_b\right)B^b = \frac{\varepsilon(A)}{A^2}\left(A_bB^b\right)A^a + h^a{}_b(A)B^b. \qquad (12.13) \]




For a general tensor $T_{ab}$ of order (0,2), working in a similar manner we find$^5$:

\[ T_{ab} = \frac{1}{A^4}\left(T_{cd}A^cA^d\right)A_aA_b + \frac{\varepsilon(A)}{A^2}h(A)_a{}^c\,T_{cd}A^dA_b + \frac{\varepsilon(A)}{A^2}h(A)_b{}^d\,T_{cd}A^cA_a + h(A)_a{}^c\,h(A)_b{}^d\,T_{cd}. \qquad (12.14) \]

We emphasize that relations (12.13) and (12.14) are mathematical identities, not
new equations!


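Example 12.2.4 Decompose the four-vector $B^a = (3, 2, 1, 1)^t$ and the tensor

\[ T_{ab} = \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]

wrt the four-vector $A^a = (3, 2, 1, 0)^t$. All components are assumed to be
in the same LCF.

Solution
We compute $A^aA_a = -4$, therefore the four-vector $A^a$ is timelike with measure
$A = 2$. The tensor product:

\[ A^a \otimes A^b = \begin{pmatrix} 9 & 6 & 3 & 0 \\ 6 & 4 & 2 & 0 \\ 3 & 2 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \]

and the projection operator:

\[ h(A)_{ab} = \begin{pmatrix} \frac{5}{4} & -\frac{3}{2} & -\frac{3}{4} & 0 \\[2pt] -\frac{3}{2} & 2 & \frac{1}{2} & 0 \\[2pt] -\frac{3}{4} & \frac{1}{2} & \frac{5}{4} & 0 \\[2pt] 0 & 0 & 0 & 1 \end{pmatrix}. \]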
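5The proof is as follows:

\[ T_{ab} = \delta^c_a\delta^d_bT_{cd} = \left(h_a{}^c + \frac{\varepsilon(A)}{A^2}A^cA_a\right)\left(h_b{}^d + \frac{\varepsilon(A)}{A^2}A^dA_b\right)T_{cd} \]

\[ = \frac{1}{A^4}\left(T_{cd}A^cA^d\right)A_aA_b + \frac{\varepsilon(A)}{A^2}\left(h_a{}^cA^dA_bT_{cd} + h_b{}^dA^cA_aT_{cd}\right) + h_a{}^ch_b{}^dT_{cd}. \]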




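For the four-vector $B^a$ we have:

\[ B^aA_a = -4 \]

\[ h(A)^a{}_bB^b = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} \]

therefore, $B^a = A^a + h(A)^a{}_bB^b$.

For the tensor $T_{ab}$ we compute:

\[ \frac{1}{A^4}T_{ab}A^aA^b = \frac{17}{16} \]

\[ \frac{\varepsilon(A)}{A^2}h(A)_a{}^c\,T_{cd}A^d = \left(\frac{39}{16},\, -\frac{21}{8},\, -\frac{33}{16},\, 0\right) \]

\[ \frac{\varepsilon(A)}{A^2}h(A)_b{}^d\,T_{cd}A^c = \left(\frac{39}{16},\, -\frac{23}{8},\, -\frac{25}{16},\, -\frac{3}{2}\right) \]

\[ h(A)_a{}^c\,h(A)_b{}^d\,T_{cd} = \begin{pmatrix} \frac{97}{16} & -\frac{57}{8} & -\frac{63}{16} & -\frac{7}{2} \\[2pt] -\frac{51}{8} & \frac{31}{4} & \frac{29}{8} & 4 \\[2pt] -\frac{87}{16} & \frac{47}{8} & \frac{73}{16} & \frac{5}{2} \\[2pt] 0 & 0 & 0 & 1 \end{pmatrix}. \]

Verify the above result using relation (12.14).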
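The requested verification can be sketched with a small general routine for the 1+3 decomposition wrt an arbitrary non-null four-vector. The function name `decompose_1p3` and its return layout are hypothetical choices made for this illustration; the routine uses the fact that $\varepsilon(A)/A^2 = 1/(A^aA_a)$ and $1/A^4 = 1/(A^aA_a)^2$.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])

def decompose_1p3(T, A_up, metric=eta):
    """Irreducible parts of T_ab wrt a non-null vector A^a, following (12.14)."""
    A_low = metric @ A_up
    norm = A_up @ A_low                        # A^a A_a = eps(A) A^2, must be non-zero
    h_low = metric - np.outer(A_low, A_low) / norm
    h_down_up = h_low @ np.linalg.inv(metric)  # h_a^c
    scalar = (A_up @ T @ A_up) / norm**2       # (1/A^4) T_cd A^c A^d
    left = h_down_up @ T @ A_up / norm         # (eps/A^2) h_a^c T_cd A^d
    right = A_up @ T @ h_down_up.T / norm      # (eps/A^2) h_b^d T_cd A^c
    normal = h_down_up @ T @ h_down_up.T       # h_a^c h_b^d T_cd
    return scalar, left, right, normal, A_low, h_low

T = np.array([[1.0, 0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 1.0, 2.0, 1.0],
              [0.0, 0.0, 0.0, 1.0]])
A_up = np.array([3.0, 2.0, 1.0, 0.0])

scalar, left, right, normal, A_low, h_low = decompose_1p3(T, A_up)

# Re-assemble T_ab from its parts.
T_rebuilt = (scalar * np.outer(A_low, A_low)
             + np.outer(left, A_low)
             + np.outer(A_low, right)
             + normal)
```

Because the routine only divides by $A^aA_a$, the same code covers timelike and spacelike vectors; it fails, as it should, for null vectors.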


12.3 1+1+2 Decomposition wrt a Pair of a Timelike
Four-Vector and a Non-null Four-Vector


In Sect. 12.2.1.2 we considered the 1 + 3 decomposition wrt a non-null vector.
However, practice has shown that we have to consider in Minkowski space the
decomposition wrt a pair of non-null non-collinear four-vectors.$^6$ Obviously, we


6Because Minkowski space is flat, it is possible to transport a four-vector from one point to another
along any path. This implies that the four-vectors need not have a common point of application.




have the following three possibilities (timelike, timelike), (timelike, spacelike), 
(spacelike, spacelike). In the present section, we discuss the first two cases. 

Let $A^a$, $B^a$ be two non-null four-vectors with length $A$, $B$ respectively and
assume that $A^a$ is timelike whereas $B^a$ can be timelike or spacelike. We are looking
for a symmetric tensor $p_{ab}$ of order (0, 2) ($p_{ab} = p_{ba}$) which will project normal to
both four-vectors. The general form of this tensor is:

\[ p_{ab}(A, B) = \eta_{ab} + a_1A_aA_b + a_2B_aB_b + a_3\left(A_aB_b + B_aA_b\right) \qquad (12.15) \]

where $a_1, a_2, a_3$ are coefficients which must be determined. It will be convenient
to introduce the invariant:

\[ \gamma = -\eta_{ab}A^aB^b = -A_aB^a. \qquad (12.16) \]

We shall determine the coefficients $a_1, a_2, a_3$ from the requirement:

\[ p_{ab}(A, B)A^b = p_{ab}(A, B)B^b = 0. \qquad (12.17) \]

Requirement $p_{ab}(A, B)A^b = 0$ gives the equations:

\[ 1 - a_1A^2 = a_3\gamma, \qquad a_2\gamma = -a_3A^2 \qquad (12.18) \]

and requirement $p_{ab}(A, B)B^b = 0$:

\[ 1 + \varepsilon(B)a_2B^2 = a_3\gamma, \qquad a_1\gamma = a_3\varepsilon(B)B^2. \qquad (12.19) \]

The solution of the system of the four equations is:

\[ a_1 = \frac{\varepsilon(B)B^2}{\gamma^2 + \varepsilon(B)A^2B^2}, \qquad a_2 = -\frac{A^2}{\gamma^2 + \varepsilon(B)A^2B^2}, \qquad a_3 = \frac{\gamma}{\gamma^2 + \varepsilon(B)A^2B^2}. \]

Therefore, the projection operator $p_{ab}(A, B)$ is defined as follows:

\[ p_{ab}(A, B) = \eta_{ab} + \frac{\varepsilon(B)B^2}{\gamma^2 + \varepsilon(B)A^2B^2}A_aA_b - \frac{A^2}{\gamma^2 + \varepsilon(B)A^2B^2}B_aB_b + \frac{\gamma}{\gamma^2 + \varepsilon(B)A^2B^2}\left(A_aB_b + B_aA_b\right). \qquad (12.20) \]

Simply they must define a 2-plane. We shall use this observation in the derivation of the covariant 
Lorentz transformation. 




In case the four-vectors $A^a$, $B^a$ are unit and $B^a$ is timelike, we write $A^a = u^a$, $B^a = v^a$
where $u^au_a = v^av_a = -1$ and relation (12.20) becomes:

\[ p_{ab}(u, v) = \eta_{ab} - \frac{1}{\gamma^2 - 1}\left[u_au_b + v_av_b - \gamma\left(u_av_b + v_au_b\right)\right]. \qquad (12.21) \]

Exercise 12.3.1 Show that the tensor $p_{ab}(A, B)$ satisfies the requirements (12.17).
Furthermore, show that the trace:

\[ p^a{}_a(A, B) = 2 \qquad (12.22) \]

and that$^7$:

\[ p_a{}^c(A, B)\,h_{cb}(A) = p_{ab}(A, B). \qquad (12.23) \]
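The properties asked for in the exercise can be spot-checked numerically before proving them. The sketch below builds $p_{ab}(A,B)$ from (12.20) in an LCF; the function name `screen_projector` and the two sample four-vectors are hypothetical choices for the illustration ($A^a$ timelike, $B^a$ spacelike and not collinear with $A^a$).

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])

def screen_projector(A_up, B_up, metric=eta):
    """p_ab(A, B) of (12.20) for A timelike and B non-null, not collinear with A."""
    A_low = metric @ A_up
    B_low = metric @ B_up
    A2 = -(A_up @ A_low)              # A^2 > 0, since A^a A_a = -A^2
    epsB_B2 = B_up @ B_low            # eps(B) B^2
    gamma = -(A_up @ B_low)           # the invariant (12.16)
    D = gamma**2 + epsB_B2 * A2       # common denominator
    correction = (epsB_B2 * np.outer(A_low, A_low)
                  - A2 * np.outer(B_low, B_low)
                  + gamma * (np.outer(A_low, B_low) + np.outer(B_low, A_low)))
    return metric + correction / D

# Hypothetical sample four-vectors.
A_up = np.array([2.0, 1.0, 0.0, 0.0])   # timelike:  A^a A_a = -3
B_up = np.array([1.0, 0.0, 3.0, 0.0])   # spacelike: B^a B_a = +8

p = screen_projector(A_up, B_up)
trace = np.trace(eta @ p)               # p^a_a = eta^{ab} p_ab
```

A numerical check of this kind does not replace the proof, but it quickly exposes a wrong sign in any of the three coefficients.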
Exercise 12.3.2 Let $A^a$ be a timelike four-vector and $B^a$ a non-null four-vector.
Consider an arbitrary four-vector $C^a$ and decompose it wrt the four-vectors $A^a$, $B^a$
as follows:

\[ C_a = a_4A_a + a_5B_a + p_{ab}(A, B)C^b \qquad (12.24) \]

where $a_4$, $a_5$ are coefficients to be determined.$^8$ Contracting with $A^a$, $B^a$ show that:

\[ a_4 = \frac{-\varepsilon(B)B^2(CA) - \gamma(CB)}{\gamma^2 + \varepsilon(B)A^2B^2}, \qquad a_5 = \frac{A^2(CB) - \gamma(CA)}{\gamma^2 + \varepsilon(B)A^2B^2} \]

where $(CA) = C^aA_a$, $(CB) = C^aB_a$. Infer that the 1+1+2 decomposition of the
four-vector $C^a$ wrt the pair of four-vectors $A^a$, $B^a$ is given by the identity:

\[ C_a = \frac{-\varepsilon(B)B^2(CA) - \gamma(CB)}{\gamma^2 + \varepsilon(B)A^2B^2}A_a + \frac{A^2(CB) - \gamma(CA)}{\gamma^2 + \varepsilon(B)A^2B^2}B_a + p_{ab}(A, B)C^b. \qquad (12.25) \]

Finally, in case the four-vectors $A^a$, $B^a$ are the timelike unit four-vectors $u^a$, $v^a$
($\varepsilon(B) = -1$) show that (12.25) reduces to:

\[ C_a = \frac{(Cu) - \gamma(Cv)}{\gamma^2 - 1}u_a + \frac{(Cv) - \gamma(Cu)}{\gamma^2 - 1}v_a + p_{ab}(u, v)C^b. \qquad (12.26) \]


Ca 


7The proof is easy: $p_a{}^c(A, B)\,h_c{}^b(A) = p_a{}^c(A, B)\left(\delta_c^b + \frac{1}{A^2}A_cA^b\right) = p_a{}^b(A, B)$.
8You can find the result by writing $C_a = \eta_{ab}C^b$ and using (12.20) to replace $\eta_{ab}$.




In Example 12.3.1, we derive the standard decomposition of a timelike four-vector
wrt a pair of two other timelike four-vectors. It would be an instructive
exercise to derive the same result using the 1+1+2 decomposition.


Example 12.3.1 Consider the timelike four-vector $p^i$, of length $p^2 = -M^2$.
Determine two timelike four-vectors $p_1^i$, $p_2^i$ such that:

\[ p^i = p_1^i + p_2^i \qquad (12.27) \]

assuming that the lengths:

\[ p_1^2 = -m_1^2, \qquad p_2^2 = -m_2^2 \qquad (12.28) \]

where $M, m_1, m_2 > 0$ are given and that the following conditions are satisfied$^9$:

\[ p_1^ip_i < 0, \qquad p_2^ip_i < 0. \qquad (12.29) \]

Solution
Let $p_1^i$, $p_2^i$ be the required four-vectors. We consider the parallel projection:

\[ p_{A\parallel}^i = \frac{p_A^jp_j}{p^2}\,p^i \]

and the normal projection:

\[ p_{A\perp}^i = p_A^i - p_{A\parallel}^i \]

of the vectors $p_A^i$ ($A = 1, 2$) wrt the four-vector $p^i$. Then equation (12.27) projected
parallel and normal to $p^i$ gives:

\[ p^i = p_{1\parallel}^i + p_{2\parallel}^i \qquad (12.30) \]

\[ 0 = p_{1\perp}^i + p_{2\perp}^i. \qquad (12.31) \]

Let $n^i$ be a unit normal to the four-vector $p^i$. We define the invariants $\lambda$, $\mu$ in terms
of $p_1^i$ with the relations:

\[ p_{1\perp}^i = \mu n^i, \qquad p_{1\parallel}^i = \lambda p^i. \]

9The last requirement means that all three four-vectors have the same sign of their zero component,
that is, they point in the same part of the light cone.


From equations (12.30) and (12.31) we have:

\[ p_{2\perp}^i = -\mu n^i, \qquad p_{2\parallel}^i = (1 - \lambda)p^i. \]

We compute the lengths of $p_1^i$, $p_2^i$ in terms of $\lambda$, $\mu$. We find:

\[ -\lambda^2M^2 + \mu^2 = -m_1^2 \qquad (12.32) \]

\[ -(1 - \lambda)^2M^2 + \mu^2 = -m_2^2. \qquad (12.33) \]

The solution of the system of equations (12.32) and (12.33) is:

\[ \lambda = \frac{M^2 + m_1^2 - m_2^2}{2M^2} = \frac{1}{M}E_1^* \qquad (12.34) \]

\[ \mu = \frac{\sqrt{\left[M^2 - (m_1 + m_2)^2\right]\left[M^2 - (m_1 - m_2)^2\right]}}{2M} = |\mathbf{p}_1^*| \qquad (12.35) \]

where $E_1^*$ is the energy of ‘particle’ 1 in the proper frame of $p^i$ and $|\mathbf{p}_1^*|$ is the length
of the 3-momentum of 1 in the same frame. We note that we recover the results of
the reaction $p^i \to p_1^i + p_2^i$. The invariants $\lambda$, $\mu$ satisfy certain restrictions. Indeed,
from the inequalities (12.29) and because $\mu^2 > 0$, one has$^{10}$:

\[ M^2 + m_1^2 - m_2^2 > 0 \]
\[ M^2 + m_1^2 - m_2^2 < 2M^2 \]
\[ \left[M^2 - (m_1 + m_2)^2\right]\left[M^2 - (m_1 - m_2)^2\right] > 0. \]

The third inequality gives:

\[ M > m_1 + m_2, \quad M > m_1 - m_2 \]

or

\[ M < m_1 + m_2, \quad M < m_1 - m_2. \]

The second case gives:

\[ M^2 < m_1^2 - m_2^2 \;\Longrightarrow\; M^2 - m_1^2 + m_2^2 < 0 \]

10The first two conditions mean that the energy is positive. The third is the restriction that the
measure of the 3-momentum is non-negative.




and contradicts the second inequality. The first case gives the condition $M > m_1 + m_2$,
which is compatible with the other two inequalities. Therefore, the condition is:

\[ M > m_1 + m_2 \qquad (12.36) \]

as expected.

Conversely, we note that if $M > m_1 + m_2$ and $n^i$ is a unit normal four-vector to
$p^i$, then the equations give a solution to the problem. Therefore, condition (12.36)
is sufficient for the solution of the problem and the general solution is given by
equations (12.34) and (12.35). We note that the general solution is determined
completely in terms of the data of the problem modulo the spacelike unit four-vector
$n^i$. Therefore we have $\infty^2$ solutions.
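The solution (12.34), (12.35) can be checked numerically by constructing the two four-vectors explicitly. The sketch below works in the proper frame of $p^i$, where $p^i = (M, 0, 0, 0)$, and picks one particular unit normal $n^i = (0, 1, 0, 0)$; the function name `two_body_invariants` and the masses are hypothetical choices for the illustration.

```python
import math
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])

def two_body_invariants(M, m1, m2):
    """Return (lambda, mu) of (12.34)-(12.35); requires M > m1 + m2, cf. (12.36)."""
    if M <= m1 + m2:
        raise ValueError("no solution unless M > m1 + m2")
    lam = (M**2 + m1**2 - m2**2) / (2.0 * M**2)
    mu = math.sqrt((M**2 - (m1 + m2)**2) * (M**2 - (m1 - m2)**2)) / (2.0 * M)
    return lam, mu

# Hypothetical sample masses.
M, m1, m2 = 5.0, 3.0, 1.0
lam, mu = two_body_invariants(M, m1, m2)

p = np.array([M, 0.0, 0.0, 0.0])      # p^i in its proper frame
n = np.array([0.0, 1.0, 0.0, 0.0])    # one choice of unit normal to p^i

p1 = lam * p + mu * n                 # parallel part lam*p, normal part mu*n
p2 = (1.0 - lam) * p - mu * n         # from (12.30) and (12.31)

def length2(v):
    """Lorentz length v^i v_i."""
    return v @ eta @ v
```

Any other unit spacelike $n^i$ normal to $p^i$ gives an equally valid pair, which is the two-parameter freedom mentioned in the text.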


Exercise 12.3.3 Let $A^a$ be a unit timelike four-vector ($A^aA_a = -1$) and $B^a$ be a
spacelike unit four-vector ($B^aB_a = 1$) and let $\gamma = -A^aB_a$ be the invariant (12.16)
formed from their inner product. Define the quantity $\Delta = 1 + \gamma^2$ and prove that the
tensor of order (0, 2) $p_{ab}(A, B)$ has the following form:

\[ p_{ab}(A, B) = \eta_{ab} + \frac{1}{\gamma^2 + 1}\left(A_aA_b - B_aB_b + \gamma\left(A_aB_b + B_aA_b\right)\right). \qquad (12.37) \]

Then prove that it has the following properties:

1. It is symmetric
2. Projects normal to both $A^a$, $B^a$, that is:

\[ p_{ab}(A, B)A^b = p_{ab}(A, B)B^b = 0. \qquad (12.38) \]
3. Its trace:

\[ p^a{}_a(A, B) = 2 \qquad (12.39) \]

4.

\[ p_{ab}(A, B)\,h^b{}_c(A) = p_{ac}(A, B). \qquad (12.40) \]
\[ p_{ab}(A, B)\,h^b{}_c(B) = p_{ac}(A, B). \qquad (12.41) \]


We call the projection tensor $p_{ab}(A, B)$ the screen projection operator
associated with the four-vectors $A$, $B$. This tensor is used in the study of spacelike
congruences, e.g. the field lines of the electric field. Note that the above expression
is fully covariant.


Chapter 13
The Electromagnetic Field


13.1 Introduction 


The Theory of Special Relativity (and consequently the Theory of General Relativity)
would never have been discovered if Maxwell had not formulated the theory of
Electrodynamics.

Before Maxwell, the electric and the magnetic field were considered to be 
independent physical fields and as such had been studied for many years by a 
number of pioneer physicists who discovered many laws concerning the Physics of 
these fields. Maxwell was the first to foresee the common origin of these fields and 
introduced the electromagnetic field as the underlying physical entity. Subsequently 
he stated the basic equations governing the evolution of the electromagnetic field 
and reproduced all the then known physical laws concerning the electric and the 
magnetic field. 

However Maxwell’s field equations had a stunning property: They were not
covariant under the classical Galileo transformations,$^1$ that is, their form was
dependent on the particular Newtonian inertial frame in which they were written!

The non-covariance of Maxwell equations wrt the Galileo group of transforma- 
tions was a serious defect. Indeed according to the covariance Principle the physical 
quantities as well as the physical laws of a theory of Physics must be covariant 
under the fundamental group of transformations of the theory. Therefore the non- 
covariance of the electromagnetic field wrt the group of Galileo transformations 
meant that the electromagnetic field was not a Newtonian physical quantity! But 
then the electric field and the magnetic field are not Newtonian physical quantities, 
a fact that one could hardly understand and accept easily. 


1That is, 3-space rotation and translation or equivalently rigid body motion.


© Springer Nature Switzerland AG 2019 421 
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-27347-7_13 




Furthermore it was found that these equations were covariant under another more 
general group of transformations, which today we call the Lorentz group. But at the 
time there was not a theory of Physics covariant wrt the Lorentz group, therefore 
the situation appeared to be impossible! 

In addition to this theoretical — but important — aspect came a number of 
experiments involving light, which could not be explained in terms of Newtonian 
Physics. These experiments were performed mainly by Michelson and Morley and 
were indicating that the speed of light in vacuum was independent of the speed of 
the source and the speed of the receiver and in fact was a universal constant. 

As is the case with such revolutionary and “out of the current line” situations,
experts reacted with “wisdom” and tried to find “generalizations” or “hidden” fields
in the well established Newtonian Theory, which could explain the peculiar behavior
of light. They invented a new “substance” to replace the absolute character of space
and time, the ether,$^2$ whose properties were postulated to be just enough to explain
the new experimental facts. However at that very moment the Theory of Special
Relativity had been born and it was only a matter of time before someone presented it.

Again, as is the case with great ideas, the development was slow and gradual. An
eminent physicist of the time, H. Lorentz (see footnote 1 Sect. 4.1), who was working
on the effects of the electromagnetic field, formulated a theory for the electron
and introduced (in an artificial way) for the first time the Lorentz contraction.
Another great theoretician H. Poincaré discussed the theoretical aspects of relative 
motion and practically stated what we call today the Einstein Relativity Principle. 
However the birth of the new theory required a young and very special man, not 
yet established therefore able to think outside the current trend. That was Albert 
Einstein, who starting from the invariance of Maxwell equations derived the Lorentz 
transformations and rewrote Maxwell equations in their intrinsic four dimensional 
form. Then taking a step further he postulated the Einstein Principle of Relativity
(see footnote 1 Sect. 4.1), claiming that in addition to the electromagnetic field there
are many more physical fields which are covariant wrt the Lorentz transformation.
These fields are the physical quantities of a new theory of Physics and together with 
the laws which govern their behavior constitute what we call today the Theory of 
Special Relativity. 

In the following sections of this chapter we discuss the relativistic form of 
Maxwell equations and we solve some well known problems which demonstrate 
the application of the four dimensional formalism in standard electrodynamics. 


2They did not call it dark matter.


3Sometimes it is stated erroneously that this Principle concerns all physical phenomena (that is,
all physical quantities). It does not. Newtonian physical quantities do not obey Einstein’s Relativity
Principle, whereas they do obey the Galileo Relativity Principle. For this reason a Newtonian physical
quantity (velocity for example) is not a relativistic physical quantity etc. Every theory of Physics
has a separate domain of application which comprises a “subset” of physical phenomena. The
theory of “everything” is part of the human utopia. See Chap. 2 for details.


13.2. Maxwell Equations in Newtonian Physics 


The theory of the electromagnetic field developed by Maxwell is a macroscopic 
theory, that is, it does not enter into the structure of matter as we understand it 
today. According to that theory the electromagnetic field in an arbitrary medium is 
described by means of four vector fields: the electric field E, the magnetic field H,
the electric induction D and the magnetic induction B.

As we have remarked in the introduction Maxwell’s equations are not covariant
under the Galileo group, therefore their form depends on the Newtonian inertial
system, K say, in which we write them. If in K there are all four fields E, H, D, B then Maxwell
equations for the electromagnetic field (in K only!) in SI units⁴ are the following:


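$$\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0 \tag{13.1}$$

$$\nabla \times \mathbf{H} - \frac{\partial \mathbf{D}}{\partial t} = \mathbf{j} \tag{13.2}$$

$$\nabla \cdot \mathbf{j} + \frac{\partial \rho}{\partial t} = 0. \tag{13.3}$$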


In these equations j is the electric current density in K, which is defined by the
relation⁵ $\mathbf{j} = \rho\mathbf{v}$, where $\rho$, $\mathbf{v}$ are the density and the velocity of the electric charge in
the frame K. Equation (13.3) expresses the conservation of electric current and it is
called the continuity equation (for the electric charge).

In empty space these equations are known with various names. The first is known
as Faraday’s Law and the second, in the static case (that is when $\partial\mathbf{D}/\partial t = 0$), as
Ampère’s Law.

An immediate consequence of (13.1)⁶ is:

$$\nabla \cdot \mathbf{B} = 0. \tag{13.4}$$

This equation implies that there is no magnetic charge and therefore no magnetic current.


4 Electromagnetism is an old subject with applications in the most diverse areas of science and
engineering. As a result there are a number of systems of units in terms of which Maxwell equations have been
written. In this book we shall use the SI system. In Sect. 13.17 we show how one writes these
equations in other systems and especially in the Gauss system.

5 The current j is more general than this but for the time being the conduction current $\mathbf{j} = \rho\mathbf{v}$
suffices.

6 To be precise equation (13.1) implies that $\nabla \cdot \mathbf{B}$ = constant in K. The requirement that the value
of this constant is 0 is an extra assumption whose discussion is outside the scope of this book. For
this reason this equation is considered as an extra independent equation compatible with the rest
of Maxwell equations.




From equations (13.2) and (13.3) it follows that:

$$\nabla \cdot \mathbf{D} = \rho. \tag{13.5}$$


Equation (13.5) (or one of its versions) is known as Gauss Law. 
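
The step from (13.2) and (13.3) to (13.5) is short, and it may help to see it written out. The following is only a sketch of the standard argument, using nothing beyond the identity $\nabla\cdot(\nabla\times\mathbf{H}) = 0$:

```latex
\begin{align*}
\nabla\cdot(\nabla\times\mathbf{H}) - \frac{\partial}{\partial t}\left(\nabla\cdot\mathbf{D}\right)
   &= \nabla\cdot\mathbf{j}
   && \text{(divergence of (13.2))} \\
-\frac{\partial}{\partial t}\left(\nabla\cdot\mathbf{D}\right)
   &= -\frac{\partial\rho}{\partial t}
   && \text{(using } \nabla\cdot(\nabla\times\mathbf{H}) = 0 \text{ and (13.3))} \\
\frac{\partial}{\partial t}\left(\nabla\cdot\mathbf{D} - \rho\right) &= 0 .
\end{align*}
```

Strictly speaking this shows only that $\nabla\cdot\mathbf{D} - \rho$ does not change with time at each point of K. Equation (13.5) corresponds to taking this constant equal to zero, in the same spirit as the remark made for $\nabla\cdot\mathbf{B}$ in the footnote to (13.4).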

Equations (13.1), (13.2), (13.4) and (13.5) constitute Maxwell equations for 
a general electromagnetic field in a general medium. In practice we consider 
special cases depending on the particular physical problem. In these cases Maxwell 
equations take a simplified form. A standard method to obtain such simplified forms 
is by considering relations among the fields D, B and E, H known as constitutive 
relations. The first such simplifying assumption we make is to assume that the medium
is homogeneous and isotropic (that is invariant under the Galileo group). For these
media we consider that the constitutive relations are:

$$\mathbf{D} = \varepsilon\mathbf{E}, \qquad \mathbf{B} = \mu\mathbf{H}. \tag{13.6}$$

The quantities $\varepsilon$ (dielectric constant) and $\mu$ (magnetic permeability) are characteristic
quantities of a medium and satisfy the relation:

$$\varepsilon\mu = 1/u^{2} \tag{13.7}$$

where u is the speed of the electromagnetic field in the medium.⁷ Because the
electromagnetic field propagates in empty space we must consider the empty space
as a medium. In Special Relativity (but not in General Relativity!) empty 3-space is
considered to be a homogeneous and isotropic medium with dielectric constant $\varepsilon_0$
and magnetic permeability $\mu_0$.⁸ Because in empty space the electromagnetic field
propagates with speed c equation (13.7) implies the relation:

$$\varepsilon_0\mu_0 = 1/c^{2}. \tag{13.9}$$


We note that equation (13.9) relates the constancy of the speed of light with the 
electromagnetic properties of empty space. 
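
As a quick numerical illustration of relations (13.7) and (13.9), the sketch below computes the propagation speed from the two constants of empty space. The function name `wave_speed` and the rounded SI values are choices made here for illustration; they are not taken from the text.

```python
import math

# Rounded SI values of the constants of empty space (illustrative inputs).
EPS0 = 8.854e-12           # dielectric constant of empty space, Farad/meter
MU0 = 4 * math.pi * 1e-7   # magnetic permeability of empty space, Henry/meter


def wave_speed(eps: float, mu: float) -> float:
    """Speed u of the electromagnetic field from eps * mu = 1 / u**2, cf. (13.7)."""
    return 1.0 / math.sqrt(eps * mu)


c = wave_speed(EPS0, MU0)
print(f"speed in empty space: {c:.4e} m/s")
```

The same function can be called with the constants $\varepsilon$, $\mu$ of any homogeneous and isotropic medium to obtain the speed u in that medium.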

For a homogeneous but not isotropic medium the fields D and H are assumed to 
be related with the fields E and B by the relations: 


$$\mathbf{D} = \varepsilon\mathbf{E} + \mathbf{P}, \qquad \mathbf{B} = \mu(\mathbf{H} + \mathbf{M}) \tag{13.10}$$


7 The quantities $\varepsilon$, $\mu$ are scalar only for homogeneous and isotropic media. For anisotropic and
non-homogeneous materials these quantities are described by second order symmetric tensors.

8 The values assigned to these constants are the following:

$$\varepsilon_0 = 8.85 \times 10^{-12}\ \text{Farad/meter}, \qquad \mu_0 = 1.26 \times 10^{-6}\ \text{Henry/meter}. \tag{13.8}$$




where we have introduced two new vector fields, the Polarization Vector P and the
Magnetization Vector M to account for the anisotropy. Obviously in empty space
$\mathbf{P} = \mathbf{M} = 0$.

The first question to pose in the relativistic study of the electromagnetic field
is whether there exists a “generalized” potential in K (which cannot be a scalar) which,
by proper differentiation, produces both the electric and the magnetic field in K. This
question we address in the next section.


13.3. The Electromagnetic Potential 


Equation (13.4) implies that (provided certain mathematical assumptions are ful- 
filled, which we assume to be the case) there exists a differentiable vector field 
A(r, t) such that: 

$$\mathbf{B} = \nabla \times \mathbf{A}. \tag{13.11}$$


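Replacing in equation (13.1) we find:

$$\nabla \times \left[\mathbf{E} + \frac{\partial \mathbf{A}}{\partial t}\right] = 0. \tag{13.12}$$

In the derivation of equation (13.12) we have used the fact that the operators $\partial/\partial t$ and
$\nabla$ commute.⁹

From a well known proposition of classical vector calculus equation (13.12) implies
that there exists a scalar function $\phi(\mathbf{r}, t)$ such that:

$$\mathbf{E} + \frac{\partial \mathbf{A}}{\partial t} = -\nabla\phi$$

hence:

$$\mathbf{E} = -\nabla\phi - \frac{\partial \mathbf{A}}{\partial t}. \tag{13.13}$$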


The functions $\phi$ and A are the potentials for the electric and the magnetic field,
because they produce these fields by proper differentiation.


9 The operators $\nabla$ and $d/dt$ do not commute, because the operator $d/dt$ includes variations in space.




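Up to this point the functions $\phi$ and A are general. To determine them we
use the remaining two Maxwell equations. In order to make things simpler
(without restricting seriously the generality, since the same hold in the case of
a homogeneous and isotropic medium) we consider Maxwell equations in empty
space¹⁰ and write the constitutive equations:

$$\mathbf{D} = \varepsilon_0\mathbf{E}, \qquad \mathbf{B} = \mu_0\mathbf{H}.$$

If we replace E and B in terms of $\phi$ and A in equations (13.5) and (13.2), respectively, we find:

$$\nabla^{2}\phi + \frac{\partial}{\partial t}\left(\nabla \cdot \mathbf{A}\right) = -\frac{\rho}{\varepsilon_0} \tag{13.14}$$

$$\nabla \times (\nabla \times \mathbf{A}) + \frac{1}{c^{2}}\frac{\partial}{\partial t}\left(\nabla\phi + \frac{\partial \mathbf{A}}{\partial t}\right) = \mu_0\mathbf{j}. \tag{13.15}$$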


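Equation (13.14) does not simplify further. Equation (13.15) can be written
differently. Using the identity:

$$\nabla \times (\nabla \times \mathbf{A}) = \nabla(\nabla \cdot \mathbf{A}) - \nabla^{2}\mathbf{A}$$

and the fact that $\nabla\frac{\partial\phi}{\partial t} = \frac{\partial}{\partial t}\nabla\phi$ after a simple calculation we find the equation:

$$\nabla^{2}\mathbf{A} - \frac{1}{c^{2}}\frac{\partial^{2}\mathbf{A}}{\partial t^{2}} - \nabla\left[\nabla \cdot \mathbf{A} + \frac{1}{c^{2}}\frac{\partial\phi}{\partial t}\right] = -\mu_0\mathbf{j}. \tag{13.16}$$

Equations (13.14) and (13.16) are the dynamical equations determining the
potentials $\phi$ and A.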
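We note that these equations do not determine the potentials $\phi$, A completely.
Indeed if we replace A with $\mathbf{A}' = \mathbf{A} + \nabla\Lambda$ where $\Lambda(\mathbf{r}, t)$ is an arbitrary (but
differentiable) function, then:

$$\nabla \times \mathbf{A}' = \nabla \times \mathbf{A} = \mathbf{B}$$

because $\nabla \times (\nabla\Lambda) = 0$ for every function $\Lambda(\mathbf{r}, t)$.

Therefore the transformation $\mathbf{A}' = \mathbf{A} + \nabla\Lambda$ leaves the magnetic field invariant.
However this is not so for the electric field, which under the transformation
$\mathbf{A}' = \mathbf{A} + \nabla\Lambda$ transforms as follows:

$$\mathbf{E} = -\nabla\phi - \frac{\partial \mathbf{A}}{\partial t} = -\nabla\phi - \frac{\partial \mathbf{A}'}{\partial t} + \nabla\left(\frac{\partial\Lambda}{\partial t}\right).$$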


10 In order to write Maxwell equations for a homogeneous and isotropic medium it is enough to
replace $1/c^{2}$ with the product $\varepsilon\mu$, that is, $c^{2}$ with the square $u^{2}$ of the speed of the electromagnetic field in the medium.




In order that the electric field will remain invariant we have to consider the
following transformation of the scalar field $\phi$:

$$\phi \to \phi' = \phi - \frac{\partial\Lambda}{\partial t}.$$

A transformation of the potentials which leaves the dynamical fields E, B
invariant we call a gauge transformation. This type of transformation is very
important in Physics.¹¹
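
The invariance of E and B under a gauge transformation can also be checked numerically. The sketch below is only an illustration: the potentials `phi` and `A`, the gauge function (whose derivatives `grad_lam` and `dlam_dt` are written by hand), and the evaluation point are arbitrary choices made here, and the derivatives in `fields` are approximated by central finite differences.

```python
import math

H = 1e-5  # step for central finite differences


def partial(f, p, i):
    """Central-difference derivative of scalar f at p = (x, y, z, t) along axis i."""
    lo, hi = list(p), list(p)
    lo[i] -= H
    hi[i] += H
    return (f(*hi) - f(*lo)) / (2 * H)


def fields(phi, A, p):
    """E = -grad(phi) - dA/dt and B = curl(A), evaluated numerically at p."""
    comp = [lambda *q, k=k: A(*q)[k] for k in range(3)]
    E = tuple(-partial(phi, p, k) - partial(comp[k], p, 3) for k in range(3))
    B = (partial(comp[2], p, 1) - partial(comp[1], p, 2),
         partial(comp[0], p, 2) - partial(comp[2], p, 0),
         partial(comp[1], p, 0) - partial(comp[0], p, 1))
    return E, B


# Illustrative potentials (arbitrary smooth functions, not from the text).
def phi(x, y, z, t):
    return x * y + t * z


def A(x, y, z, t):
    return (y * t, math.sin(z), x * x * t)


# Gauge function Lambda = sin(x)*y*t + z**2 * t**2; derivatives written by hand.
def grad_lam(x, y, z, t):
    return (math.cos(x) * y * t, math.sin(x) * t, 2 * z * t * t)


def dlam_dt(x, y, z, t):
    return math.sin(x) * y + 2 * z * z * t


def phi_new(x, y, z, t):   # phi' = phi - dLambda/dt
    return phi(x, y, z, t) - dlam_dt(x, y, z, t)


def A_new(x, y, z, t):     # A' = A + grad(Lambda)
    return tuple(a + g for a, g in zip(A(x, y, z, t), grad_lam(x, y, z, t)))


point = (0.3, -1.2, 0.7, 2.0)
E_old, B_old = fields(phi, A, point)
E_new, B_new = fields(phi_new, A_new, point)
max_diff = max(abs(a - b) for a, b in zip(E_old + B_old, E_new + B_new))
print("largest change in any component of E or B:", max_diff)
```

The printed number is not exactly zero because of the finite-difference approximation, but it should be far smaller than the field components themselves. Changing only A (without the compensating change of the scalar potential) makes the electric field differ, in agreement with the discussion above.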

The indeterminacy in the determination of the potentials gives us the freedom to 
define the function $\Lambda$ by means of a scalar relation among the potentials $\phi$, A and
select the pair of potentials which is most convenient to our purposes. The basic 
criterion for this relation is that it must be tensorial, that is covariant under the 
fundamental group of the theory. Otherwise the relation will be frame dependent. It 
is evident that in the theory of Special Relativity this condition must be covariant 
under the Lorentz group, whereas in Newtonian Physics under the Galileo group. 
A second criterion, however of less importance, is that the condition which will be 
considered must lead to potential functions which simplify equations (13.14) and 
(13.16). 

In the Theory of Special Relativity the condition which fulfils both requirements 
is the Lorentz gauge, which for empty space is defined as follows: 


$$\nabla \cdot \mathbf{A} + \frac{1}{c^{2}}\frac{\partial\phi}{\partial t} = 0. \tag{13.17}$$


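We note that with the Lorentz gauge the field equations (in empty space!)
become:

$$\nabla^{2}\phi - \frac{1}{c^{2}}\frac{\partial^{2}\phi}{\partial t^{2}} = -\frac{\rho}{\varepsilon_0} \quad\text{or}\quad \Box\phi = -\frac{\rho}{\varepsilon_0} \tag{13.18}$$

$$\nabla^{2}\mathbf{A} - \frac{1}{c^{2}}\frac{\partial^{2}\mathbf{A}}{\partial t^{2}} = -\mu_0\mathbf{j} \quad\text{or}\quad \Box\mathbf{A} = -\mu_0\mathbf{j} \tag{13.19}$$

where

$$\Box = \nabla^{2} - \frac{1}{c^{2}}\frac{\partial^{2}}{\partial t^{2}} \tag{13.20}$$

is the D’Alembert operator.¹² As we have seen in Exercise 4.2.1 D’Alembert’s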
operator is Lorentz covariant. Therefore with the use of the Lorentz gauge the 


11 A gauge transformation in a dynamical theory is a transformation of the dynamical fields such
that the transformed fields satisfy the same dynamical equations as the original fields.
12 It is interesting to read about the life and the work of D’Alembert. The interested reader can visit
the web site http://www-history.mcs.st-andrews.ac.uk/Biographies/D’Alembert.html.




equations of the electromagnetic potentials become similar and are written in 
Lorentz covariant form in terms of the D’Alembert operator. This observation leads
us in a natural way to the four potential and enables us to write Maxwell equations 
in four dimensional formalism. 

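Another aspect of the Lorentz gauge is that it implies the equation of continuity.
Indeed from equations (13.18) and (13.19) we have:

$$\nabla \cdot (\Box\mathbf{A}) = -\mu_0\nabla \cdot \mathbf{j}, \qquad \frac{1}{c^{2}}\frac{\partial}{\partial t}(\Box\phi) = -\mu_0\frac{\partial\rho}{\partial t}.$$

Adding these two relations, the lhs becomes $\Box\left(\nabla \cdot \mathbf{A} + \frac{1}{c^{2}}\frac{\partial\phi}{\partial t}\right)$, which vanishes due to the Lorentz gauge (13.17), therefore:

$$\nabla \cdot \mathbf{j} + \frac{\partial\rho}{\partial t} = 0$$

which proves our assertion.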

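Because the dynamical equations for the potentials are covariant wrt the gauge
transformation of the fields, we demand that the Lorentz gauge also will be invariant
under the gauge transformation. This condition determines the gauge function
$\Lambda(\mathbf{r}, t)$. Indeed the condition $\nabla \cdot \mathbf{A}' + \frac{1}{c^{2}}\frac{\partial\phi'}{\partial t} = 0$ gives:

$$\nabla \cdot (\mathbf{A} + \nabla\Lambda) + \frac{1}{c^{2}}\frac{\partial}{\partial t}\left(\phi - \frac{\partial\Lambda}{\partial t}\right) = \nabla \cdot \mathbf{A} + \frac{1}{c^{2}}\frac{\partial\phi}{\partial t} + \nabla^{2}\Lambda - \frac{1}{c^{2}}\frac{\partial^{2}\Lambda}{\partial t^{2}} = \Box\Lambda = 0.$$

We conclude that the function $\Lambda$ must be a solution of the wave equation in
empty space.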

People have proposed gauge conditions for the potentials $\phi$, A other than the
Lorentz condition. However they are not Lorentz covariant and/or they do not imply
the continuity equation for the charge. For example in Newtonian Physics the gauge
condition for the electromagnetic potentials is the Coulomb gauge, which is defined
by the relation:

$$\nabla \cdot \mathbf{A} = 0. \tag{13.21}$$

The name of this gauge is due to the fact that its use simplifies the field equation
(13.14) to the following:

$$\nabla^{2}\phi = -\frac{\rho}{\varepsilon_0} \tag{13.22}$$




which is the well known Poisson equation whose solution is:

$$\phi(\mathbf{r}, t) = \frac{1}{4\pi\varepsilon_0}\int_{\Omega}\frac{\rho(\mathbf{r}', t)}{|\mathbf{r} - \mathbf{r}'|}\,dV' \tag{13.23}$$

where $\rho(\mathbf{r}', t)$ is the charge density (in the Newtonian inertial system K where
Maxwell equations are considered!). This relation means that the scalar potential
is the instantaneous Coulomb potential due to the charge density $\rho(\mathbf{r}, t)$ distributed
in the space $\Omega$ (and measured in K).

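Collecting the above results we state that Maxwell equations in empty space in
the Lorentz gauge are:

a. For the fields:

$$\mathbf{B} = \nabla \times \mathbf{A} \tag{13.24}$$

$$\mathbf{E} = -\nabla\phi - \frac{\partial \mathbf{A}}{\partial t} \tag{13.25}$$

b. For the potentials:

$$\Box\phi = -\frac{\rho}{\varepsilon_0} \tag{13.26}$$

$$\Box\mathbf{A} = -\mu_0\mathbf{j} \tag{13.27}$$

c. Equation of continuity:

$$\nabla \cdot \mathbf{j} + \frac{\partial\rho}{\partial t} = 0 \tag{13.28}$$

d. Lorentz gauge:

$$\nabla \cdot \mathbf{A} + \frac{1}{c^{2}}\frac{\partial\phi}{\partial t} = 0. \tag{13.29}$$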


The main result which follows from the above equations is that if we solve 
the wave equation for scalar and vector waves, then we have solved completely 
Maxwell equations and we have determined the electromagnetic field. 


The difference between the Lorentz gauge and the Coulomb gauge is that the 
first is covariant wrt the Lorentz group whereas the second wrt the Euclidean group. 
This becomes clear from the following example. 




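Example 13.3.1

a. In the LCF $\Sigma$ consider the four-quantity $\left(\frac{1}{c}\frac{\partial}{\partial t}, \nabla\right)$ and show that it defines a
four-vector. Show that the length of this four-vector is the d’Alembert operator
$\Box = \nabla^{2} - \frac{1}{c^{2}}\frac{\partial^{2}}{\partial t^{2}}$ and conclude that the d’Alembert operator is covariant wrt the
Lorentz group.

b. Show that in a LCF $\Sigma'$ moving with parallel axes wrt the LCF $\Sigma$ with velocity
u, the $\nabla$ transforms as follows:

$$\nabla' = \nabla + \left[\frac{\gamma - 1}{u^{2}}(\mathbf{u} \cdot \nabla) + \frac{\gamma}{c^{2}}\frac{\partial}{\partial t}\right]\mathbf{u}. \tag{13.30}$$
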
Solution 


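a. We consider a LCF $\Sigma'$ moving in the standard configuration relative to the LCF
$\Sigma$ with parameter $\beta$. The Lorentz transformation relating $\Sigma'$ and $\Sigma$ is:

$$x = \gamma(x' + \beta ct'), \quad ct = \gamma(ct' + \beta x'), \quad y = y', \quad z = z'.$$

We have then:

$$\frac{\partial}{\partial x'} = \frac{\partial}{\partial x}\frac{\partial x}{\partial x'} + \frac{\partial}{\partial y}\frac{\partial y}{\partial x'} + \frac{\partial}{\partial z}\frac{\partial z}{\partial x'} + \frac{\partial}{\partial (ct)}\frac{\partial (ct)}{\partial x'} = \gamma\frac{\partial}{\partial x} + \beta\gamma\frac{\partial}{\partial (ct)}$$

$$\frac{\partial}{\partial y'} = \frac{\partial}{\partial y}, \qquad \frac{\partial}{\partial z'} = \frac{\partial}{\partial z}$$

$$\frac{\partial}{\partial (ct')} = \frac{\partial}{\partial x}\frac{\partial x}{\partial (ct')} + \frac{\partial}{\partial y}\frac{\partial y}{\partial (ct')} + \frac{\partial}{\partial z}\frac{\partial z}{\partial (ct')} + \frac{\partial}{\partial (ct)}\frac{\partial (ct)}{\partial (ct')} = \gamma\frac{\partial}{\partial (ct)} + \beta\gamma\frac{\partial}{\partial x}.$$

Therefore:

$$\begin{pmatrix} -\frac{\partial}{\partial (ct')} \\ \frac{\partial}{\partial x'} \\ \frac{\partial}{\partial y'} \\ \frac{\partial}{\partial z'} \end{pmatrix} = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} -\frac{\partial}{\partial (ct)} \\ \frac{\partial}{\partial x} \\ \frac{\partial}{\partial y} \\ \frac{\partial}{\partial z} \end{pmatrix} \tag{13.31}$$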



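where in the rhs the multiplication is matrix multiplication. This equation means
that the quantity $\nabla_i = \left(\frac{1}{c}\frac{\partial}{\partial t}, \nabla\right)$ (note the lowering of the index and the
consequent change in the sign of the zeroth component!¹³) is a four-vector.
The length of this four-vector is:

$$\eta^{ij}\nabla_i\nabla_j = \nabla^{2} - \frac{1}{c^{2}}\frac{\partial^{2}}{\partial t^{2}}.$$

This quantity is invariant that is:

$$\nabla'^{2} - \frac{1}{c^{2}}\frac{\partial^{2}}{\partial t'^{2}} = \nabla^{2} - \frac{1}{c^{2}}\frac{\partial^{2}}{\partial t^{2}}. \tag{13.32}$$

b. We consider the LCF $\Sigma'$ whose axes are parallel to the ones of the LCF $\Sigma$ and
moves with velocity u. Then the general Lorentz transformation (1.52) which
relates $\Sigma$ and $\Sigma'$ gives for the 3-vector $\nabla$:

$$\nabla' = \nabla + \left[\frac{\gamma - 1}{u^{2}}(\mathbf{u} \cdot \nabla) + \frac{\gamma}{c^{2}}\frac{\partial}{\partial t}\right]\mathbf{u}.$$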
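
The invariance of the length used in part a. rests on the fact that the matrix of equation (13.31) preserves the Lorentz metric. The following sketch checks this numerically for one value of $\beta$; the helper names (`boost`, `matmul`, `length2`) and the test components `v` are hypothetical choices made here, with ordinary numbers standing in for the operator components.

```python
import math

ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # Lorentz metric


def boost(beta):
    """Matrix of equation (13.31): boost along x in the standard configuration."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -beta * g, 0, 0],
            [-beta * g, g, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]


def matmul(a, b):
    """Product of two 4 x 4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def length2(v):
    """Lorentz length eta_ij v^i v^j of a four-component column v."""
    return sum(ETA[i][j] * v[i] * v[j] for i in range(4) for j in range(4))


L = boost(0.6)
preserved = matmul(transpose(L), matmul(ETA, L))  # should reproduce ETA

v = [0.4, 1.0, -2.0, 0.5]                          # arbitrary test components
v_prime = [sum(L[i][j] * v[j] for j in range(4)) for i in range(4)]
print(length2(v), length2(v_prime))
```

The two printed lengths agree up to rounding, which is the numerical counterpart of relation (13.32).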


13.4 The Equation of Continuity 


The equations of continuity appear in all theories of Physics and express the 
conservation of a current. In the Theory of Special Relativity the equations of 
continuity have a special significance which we show in this section. 

In a LCF $\Sigma$ we consider a Euclidean vector field A and a Euclidean invariant B
which satisfy an equation of continuity, that is:

$$\nabla \cdot \mathbf{A} + \frac{\partial B}{\partial t} = 0. \tag{13.33}$$


13 We recall at this point the rule for signs of the components when we lower and raise the index
of a four-vector. To the contravariant four-vector $A^{i}$ we correspond the covariant four-vector
$A_{i}$ with the relation $A_{i} = \eta_{ij}A^{j}$ where $\eta_{ij}$ is the Lorentz metric. In case we have Lorentz
orthonormal frames (which we assume to be the rule) the Lorentz metric has its canonical form
$\eta_{ij} = \mathrm{diag}(-1, 1, 1, 1)$. Therefore in such frames (and only there!) to the four-vector $A^{i} = \begin{pmatrix} B \\ \mathbf{A} \end{pmatrix}_{\Sigma}$
corresponds the four-vector $A_{i} = (-B, \mathbf{A})_{\Sigma}$, that is, the sign of the zeroth component changes
and the matrix from column becomes row. If the frame is not Lorentz orthonormal then the above
simple rule does not apply and one (a) has to compute the components of the Lorentz metric
and then (b) multiply the resulting 4 x 4 matrix with the column (or row) matrix defined by the
components of the four-vector in the same frame.




This relation can be written as follows:

$$\begin{pmatrix} -\frac{1}{c}\frac{\partial}{\partial t} \\ \nabla \end{pmatrix}_{\Sigma} \cdot \begin{pmatrix} Bc \\ \mathbf{A} \end{pmatrix}_{\Sigma} = 0 \tag{13.34}$$

where the dot indicates Lorentz inner product. This equation attains a physical
meaning in Special Relativity only when it is covariant under the action of Lorentz
transformations, which we assume to be the case. What does this assumption imply for
the four quantity $\begin{pmatrix} Bc \\ \mathbf{A} \end{pmatrix}_{\Sigma}$?

If the rhs was not zero then by the quotient theorem the four quantity $A^{i} = \begin{pmatrix} Bc \\ \mathbf{A} \end{pmatrix}_{\Sigma}$
should be a four-vector, because the operator $\begin{pmatrix} -\frac{1}{c}\frac{\partial}{\partial t} \\ \nabla \end{pmatrix}_{\Sigma}$ is a four-vector.
But the rhs vanishes and the quotient theorem does not apply, because
zero is covariant wrt any homogeneous transformation (the Lorentz transformation
included!).

However although the equation of continuity does not determine the Lorentz
covariance of the four quantity $\begin{pmatrix} Bc \\ \mathbf{A} \end{pmatrix}_{\Sigma}$ it does say that the assumption $A^{i}$ four-vector
is acceptable. In this case the equation of continuity is written $A^{i}{}_{,i} = 0$, it
is Lorentz covariant and has a physical meaning in Special Relativity. The physical
meaning we give it, is that the four-vector $A^{i}$ represents a current of a physical
quantity and the equation of continuity expresses the lack of charge for this current,
or equivalently, the lack of sinks and sources for this physical field.

The consideration $A^{i}$ four-vector is not the only possible and of course need not
be compatible with other transformations¹⁴ which the Euclidean quantities A, Bc
may obey.


14 Indeed let us assume that the four quantity $(Bc, \mathbf{A})_{\Sigma}$ transforms from a LCF $\Sigma$ to the LCF $\Sigma'$
according to the rule:

$$A^{i} \to A^{i'} = \kappa(v)A^{i}$$

where $\kappa(v)$ is a function of the relative velocity v of the frames $\Sigma$, $\Sigma'$. Then relation (13.34) is
preserved in the new frame. Indeed:

$$A^{i'}{}_{,i'} = Q(v)\left(\kappa(v)A^{i}\right)_{,i} = Q(v)\kappa(v)A^{i}{}_{,i}$$

where $Q(v)$ is some function of the relative velocity resulting from the Lorentz transformation.
This condition is not Lorentz covariant except if and only if $Q(v)\kappa(v) = 1$.




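In electromagnetism the equation of continuity is the following:

$$\nabla \cdot \mathbf{j} + \frac{\partial\rho}{\partial t} = 0$$

where $\mathbf{j} = \rho\mathbf{u}$ is the conduction current in a LCF $\Sigma$, $\rho$ is the density of the charge
in $\Sigma$ and u is the velocity of the charge in $\Sigma$. This equation gives in $\Sigma$ the four
quantity:

$$j^{i} = \begin{pmatrix} \rho c \\ \mathbf{j} \end{pmatrix}_{\Sigma} = \begin{pmatrix} \rho c \\ \rho\mathbf{u} \end{pmatrix}_{\Sigma} = \frac{\rho}{\gamma}\begin{pmatrix} \gamma c \\ \gamma\mathbf{u} \end{pmatrix}_{\Sigma} = \frac{\rho}{\gamma}u^{i} \tag{13.35}$$

where $u^{i}$ is the four-velocity of the charge. Therefore the requirement $j^{i}$ to be a four-vector
implies that the quantity $\rho_0 = \rho/\gamma$ must be Lorentz invariant. The invariant $\rho_0$
we call proper density of the charge and assume that it measures the density of the
charge in the rest frame of the charge. The quantity $\rho = \rho_0\gamma$ is the density of the charge
in the LCF $\Sigma$. We note that $\rho > \rho_0$, that is, we have “charge dilation”.¹⁵ The four-vector
$j^{i}$ we call conduction four-current and we define it covariantly as follows:

$$j^{i} = \rho_0 u^{i}. \tag{13.36}$$

The equation of continuity in electromagnetism reads:

$$j^{i}{}_{,i} = 0. \tag{13.37}$$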
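
To make the covariant definition (13.36) more concrete, the sketch below builds the components of the conduction four-current in a frame where the charge moves with a given 3-velocity and checks two things: that the density in that frame is $\gamma$ times the proper density, and that the Lorentz length of $j^{i}$ equals $-\rho_0^{2}c^{2}$ whatever the velocity. The function names and the numerical values are assumptions made here for illustration only.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(u):
    """Lorentz factor for a 3-velocity u = (ux, uy, uz)."""
    return 1.0 / math.sqrt(1.0 - sum(x * x for x in u) / C**2)


def four_current(rho0, u):
    """Components (rho*c, rho*u) of j^i = rho0 * u^i, with rho = gamma * rho0."""
    g = gamma(u)
    return [rho0 * g * C] + [rho0 * g * x for x in u]


def lorentz_length2(v):
    """eta_ij v^i v^j with eta = diag(-1, 1, 1, 1)."""
    return -v[0] ** 2 + sum(x * x for x in v[1:])


rho0 = 2.0e-6              # proper charge density, illustrative value
u = (0.6 * C, 0.0, 0.0)    # illustrative 3-velocity of the charge
j = four_current(rho0, u)
rho = j[0] / C             # charge density measured in the frame
print("rho / rho0 =", rho / rho0)
print("j.j / (rho0 c)^2 =", lorentz_length2(j) / (rho0 * C) ** 2)
```

The first ratio is the factor $\gamma$ of the "charge dilation" and the second is the invariant $-1$, independently of the velocity chosen.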


15 It is important to stay for a moment at this point. Indeed so much effort and discussion have
been devoted to time dilation whereas charge dilation has passed unnoticed. Obviously this
is due to the facts that (a) time is considered to be (unreasonably!) more fundamental than charge
and (b) time concerns everyone, therefore it is more familiar to common reason than charge.
Special Relativity proposed a new view of the cosmos, because it rejected the absolute character of
space and time, hence it affected the concept of “omnipresent mind” which is, in one way or
another, at the roots of all social structures. As a result this theory raised much reaction and
provoked many “doubts” and paradoxes during the first years of its existence. Later theories of
Physics, General Relativity and especially Quantum Mechanics, were considered to be a matter for
physicists only. Indeed the first was concerned with the large scale universe and the second with
the very small scale. The realm of application of both theories was therefore too far from our direct
sensory experience, therefore they did not bother the wide social scale. Thus they were accepted
without wide social reaction. However, needless to say, the developments of Quantum Mechanics
today and in the future control most of our everyday activities and our social functioning.
Newtonian Physics was founded on the archetypes of the society. After Special Relativity it is
Physics which creates and enforces its archetypes on the society. The solid harmony introduced
by ancient Greeks and developed further in Europe was based on the human “structure” and the
sensory environment. That died with the appearance of the new relativistic “point of view” and was
replaced with a technocratic lividness structure which is object oriented, that is, the “I am” has been
replaced with “I have”.




We assume the electric charge to be Lorentz invariant and define it in the LCF $\Sigma$
with the relation:

$$dQ = \rho\,dV \tag{13.38}$$

where dV is an elementary volume at the point of 3-space of $\Sigma$ where the charge
density is $\rho$. In the proper frame of the charge, $dQ = \rho_0\,dV_0$ from which follows
the well known transformation of 3-volume:

$$dV = \frac{1}{\gamma}dV_0. \tag{13.39}$$


Example 13.4.1

a. Show that the elementary 3-volume dV under a Lorentz transformation with
parameter $\gamma$ transforms as follows $dV = \frac{1}{\gamma}dV^{+}$ where $dV^{+}$ is the 3-volume
in the proper frame.

b. Let A be an invariant physical quantity (e.g., mass, charge etc.) with density
$\rho_A$. Show that under the same Lorentz transformation the density transforms
according to the relation $\rho_A = \gamma\rho_A^{+}$ where $\rho_A^{+}$ is the density in the proper frame
of the volume.

c. Suppose the quantity A of question b. satisfies an equation of continuity, that is,
in an arbitrary LCF $\Sigma$:

$$\nabla \cdot \mathbf{j}_A + \frac{\partial\rho_A}{\partial t} = 0$$

where $\mathbf{j}_A$ is a current which is related to A and $\rho_A$ is the density of A in $\Sigma$. This
equation is written:

$$\frac{\partial (c\rho_A)}{\partial (ct)} + \nabla \cdot \mathbf{j}_A = 0.$$

Assume that the four quantity $J_A^{a} = \begin{pmatrix} c\rho_A \\ \mathbf{j}_A \end{pmatrix}_{\Sigma}$ is a four-vector and find a covariant
relation among $J_A^{a}$ and the four-velocity $u^{a}$ of the elementary volume dV in $\Sigma$.
Finally show that in this case the equation of continuity of the quantity A is
written covariantly as $J_A^{a}{}_{,a} = 0$.


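Solution

a. We consider the elementary surface dS normal to the 3-velocity u of the
elementary volume in the LCF $\Sigma$ and we have $dV = dS\,dl$. Under the action
of the Lorentz transformation $dS = dS^{+}$ and $dl = \gamma^{-1}dl^{+}$ from which follows
$dV = \gamma^{-1}dV^{+}$.

b. Because A is invariant we have:

$$\rho_A\,dV = \rho_A^{+}\,dV^{+} \;\Rightarrow\; \rho_A = \gamma\rho_A^{+}.$$

c. Assuming that the quantity $\begin{pmatrix} c\rho_A \\ \mathbf{j}_A \end{pmatrix}_{\Sigma}$ is a four-vector we have:

$$\begin{pmatrix} c\rho_A \\ \mathbf{j}_A \end{pmatrix}_{\Sigma} = \begin{pmatrix} \gamma c\rho_A^{+} \\ \mathbf{j}_A \end{pmatrix}_{\Sigma} = \rho_A^{+}\begin{pmatrix} \gamma c \\ \mathbf{j}_A/\rho_A^{+} \end{pmatrix}_{\Sigma}.$$

The only four-vector whose zeroth component equals $\gamma c$ in the arbitrary LCF $\Sigma$
is the four-velocity $u^{a}$ of the elementary volume dV. It follows that $\mathbf{j}_A = \rho_A\mathbf{u}$
where u is the 3-velocity of the elementary volume dV in $\Sigma$. We infer that the
four-current of the quantity A is defined covariantly with the relation $J_A^{a} = \rho_A^{+}u^{a}$
where $u^{a}$ is the four-velocity of every elementary volume dV of A. In terms of
the four-current the equation of continuity is written as the Lorentz product:

$$\frac{\partial (c\rho_A)}{\partial (ct)} + \nabla \cdot \mathbf{j}_A = \frac{\partial J_A^{0}}{\partial x^{0}} + \nabla \cdot \mathbf{j}_A = J_A^{a}{}_{,a} = 0.$$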


Example 13.4.2

a. A current I flows through a straight conductor of infinite length. Calculate the
magnetic field at a distance r from the conductor.

b. A straight conductor of infinite length is charged homogeneously with a positive
charge of density $\rho$. Calculate the electric field at a distance r from the conductor.

c. In the LCF $\Sigma$ the current I flows through a straight conductor of infinite length while a
charge q moves parallel to the direction of the conductor with constant speed u
at a distance r from the conductor. Study the motion of the charge in the proper
frame of the conductor and the proper frame of the charge.

(It is given that $\varepsilon_0 = 8.85418 \times 10^{-12}\ \mathrm{C^{2}/(N\,m^{2})}$ (the useful quantity is $\frac{1}{4\pi\varepsilon_0} = 9.0 \times 10^{9}\ \mathrm{N\,m^{2}/C^{2}}$) and $\mu_0 = 4\pi \times 10^{-7}\ \mathrm{Weber/(Ampere \cdot meter)}$.)

Solution


a. According to Ampère’s Law the magnetic field B which is created by a current I
is given (in the SI system of units) by the relation

$$\oint \mathbf{B} \cdot d\mathbf{l} = \mu_0 I \tag{13.40}$$

where the integral is computed over a smooth closed curve in the space of the
magnetic field and $d\mathbf{l}$ is the element of length along this curve (see Fig. 13.1).

In the case of a straight conductor of infinite length, due to symmetry, the
magnetic field at a point P in space must have a direction normal to the direction
of the conductor and a strength depending on the distance r of P from the
conductor. In order to compute the magnetic field at the point P we consider the
plane through P which is normal to the conductor and in that plane we consider
a circle centered at the conductor with radius r. Then $\mathbf{B} = B\mathbf{e}_\phi$ and $d\mathbf{l} = dl\,\mathbf{e}_\phi$
hence:

$$\oint \mathbf{B} \cdot d\mathbf{l} = \int_0^{2\pi} B r\,d\phi = 2\pi r B.$$

Ampère’s Law gives:

$$\mathbf{B} = \frac{\mu_0 I}{2\pi r}\mathbf{e}_\phi$$

where $(\mathbf{e}_r, \mathbf{e}_\theta, \mathbf{e}_\phi)$ is the right handed system of spherical coordinates.

Fig. 13.1 Ampère’s Law: magnetic field lines

b. If a charge, Q say, is enclosed in the interior of a smooth closed surface S, then
the electric field E on the surface is given by Gauss Law, which in the SI system
of units reads:

$$\oint_S \mathbf{E} \cdot d\mathbf{S} = \frac{Q}{\varepsilon_0}$$

where $d\mathbf{S}$ is the elementary surface at a point P of S and E is the value of the
electric field at the point P.

In the case of a straight conductor of infinite length with charge density $\rho$ we
consider a cylindrical surface with axis the conductor, radius r and length 2l and
have:

$$\oint \mathbf{E} \cdot d\mathbf{S} = \int_{-l}^{l} E\,2\pi r\,dl.$$

The $\mathbf{E} \parallel d\mathbf{S}$ due to symmetry, that is, $\mathbf{E} = E\mathbf{e}_r$, $d\mathbf{S} = dS\,\mathbf{e}_r$. Gauss Law gives:

$$\mathbf{E} = \frac{\rho}{2\pi r\varepsilon_0}\mathbf{e}_r$$

where $\rho = \frac{Q}{2l}$ is the linear density of charge. The direction of E depends on the
sign of the charge density.

c. In the proper frame of the conductor, $\Sigma$ say, the current flowing through the conductor
creates a magnetic field. There does not exist an electric field because the metallic
conductor is electrically neutral. Indeed the conductor consists of positive ions
of charge density $\rho_+$, which have velocity zero, and free electrons of density
$\rho_- = \rho_+$ which move with small velocity v, opposite to the direction of the
current. In the frame $\Sigma$ we consider a charge q which moves with velocity u
parallel to the direction of the current in the conductor, therefore it suffers the
Lorentz force:

$$\mathbf{F} = q\mathbf{u} \times \mathbf{B}.$$

As we saw in a. the magnetic field $\mathbf{B} = \frac{\mu_0 I}{2\pi r}\mathbf{e}_\phi$ therefore:

$$\mathbf{F} = -\frac{\mu_0 I q u}{2\pi r}\mathbf{e}_r$$

where $\mathbf{e}_r$ is the unit normal with direction from the conductor to the charge. We note
that the charge is attracted by the conductor.¹⁶

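We consider now the motion in the proper frame of the charge, $\Sigma'$ say. In that
frame the densities of the positive ions $\rho'_+$ and the free electrons $\rho'_-$ are not equal and
they must be calculated from the Lorentz transformation relating $\Sigma$, $\Sigma'$. Suppose
that in $\Sigma'$ the four-current of the positive ions $j_+^{a'}$ and that of the free electrons $j_-^{a'}$
are:

$$j_-^{a'} = \begin{pmatrix} \rho'_- c \\ \rho'_- v' \\ 0 \\ 0 \end{pmatrix}_{\Sigma'}, \qquad j_+^{a'} = \begin{pmatrix} \rho'_+ c \\ -\rho'_+ u \\ 0 \\ 0 \end{pmatrix}_{\Sigma'}.$$

We know that in the proper frame $\Sigma$ of the conductor these four-vectors are:

$$j_-^{a} = \begin{pmatrix} \rho_- c \\ \rho_- v \\ 0 \\ 0 \end{pmatrix}_{\Sigma}, \qquad j_+^{a} = \begin{pmatrix} \rho_+ c \\ 0 \\ 0 \\ 0 \end{pmatrix}_{\Sigma}.$$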

16 From this result we can compute the force between two straight parallel conductors of infinite
length carrying currents $I_1$ and $I_2$. Indeed the first conductor (the $I_1$ say) exerts on the elementary
length $dl_2$ of the second conductor the force $\mathbf{F}_{21} = -dq_2 u\frac{\mu_0 I_1}{2\pi r}\mathbf{e}_r$. But the length $dl_2$ carries charge
$dq_2 = I_2\,dt$ so that $dq_2 u = I_2\,dl_2$, therefore the force per unit length on the current $I_2$ due to the
current $I_1$ is: $\frac{\mathbf{F}_{21}}{dl_2} = -\frac{\mu_0 I_1 I_2}{2\pi r}\mathbf{e}_r$. This force is obviously attractive. If the currents have opposite
direction of flow that force is repulsive, a phenomenon with important practical applications.




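The Lorentz transformation gives:

$$\rho'_- c = \gamma_u\left(\rho_- c - \frac{u}{c}\rho_- v\right) = \gamma_u\rho_- c\left(1 - \frac{uv}{c^{2}}\right)$$

$$\rho'_+ c = \gamma_u\rho_+ c = \gamma_u\rho_- c$$

(because in $\Sigma$ $\rho_+ = \rho_-$). The overall charge density in $\Sigma'$ is:

$$\rho' = \rho'_+ - \rho'_- = \gamma_u\rho_-\frac{uv}{c^{2}}.$$

This means that the conductor appears to be charged in $\Sigma'$ although it is neutral
in $\Sigma$! This fact is completely at odds with the Newtonian point of view and it is
due to the fact that the charge density (not the charge!) is not Lorentz invariant in
Special Relativity.

This charge of the conductor creates at the position of the moving charge the
electric field:

$$\mathbf{E}' = \frac{\rho'}{2\pi r'\varepsilon_0}\mathbf{e}_r = \gamma_u\frac{\rho_- uv}{2\pi r'\varepsilon_0 c^{2}}\mathbf{e}_r.$$

But $r' = r$ because r is measured normal to the velocity u. Also $\varepsilon_0 c^{2} = \frac{1}{\mu_0}$.
Replacing we find:

$$\mathbf{E}' = \gamma_u\frac{\mu_0(\rho_- v)u}{2\pi r}\mathbf{e}_r.$$

Now $\rho_- v = j_- = -I$ where I is the current through the conductor. Therefore:

$$\mathbf{E}' = -\gamma_u\frac{\mu_0 u I}{2\pi r}\mathbf{e}_r = -\gamma_u E\,\mathbf{e}_r, \qquad E = \frac{\mu_0 u I}{2\pi r}.$$

The force which is exerted on the moving charge in $\Sigma'$ is due to this electric field
and equals:

$$\mathbf{F}' = q\mathbf{E}' = -\gamma_u qE\,\mathbf{e}_r = -\gamma_u F\,\mathbf{e}_r$$

where $F = qE = \frac{\mu_0 q u I}{2\pi r}$ is the strength of the force on the charge in $\Sigma$, which is due to the magnetic field created
by the current I.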




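This result is compatible with the transformation of the four-force. Indeed in $\Sigma$
the product $\mathbf{F} \cdot \mathbf{u} = 0$, hence the four-force on the charge q has components:

$$F^{i} = \begin{pmatrix} 0 \\ \gamma_u F_\parallel = 0 \\ \gamma_u\mathbf{F}_\perp = -\gamma_u F\,\mathbf{e}_r \end{pmatrix}_{\Sigma}$$

where F is the strength of the force on the charge in $\Sigma$. In the LCF $\Sigma'$ the four-force is:

$$F^{i'} = \begin{pmatrix} 0 \\ F'_\parallel \\ \mathbf{F}'_\perp \end{pmatrix}_{\Sigma'}.$$

The Lorentz transformation relating $\Sigma$, $\Sigma'$ gives:

$$F'_\parallel = \gamma_u\left(\gamma_u F_\parallel - \beta \cdot 0\right) = 0, \qquad \mathbf{F}'_\perp = \gamma_u\mathbf{F}_\perp = -\gamma_u F\,\mathbf{e}_r$$

which coincides with the previous result.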
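
The agreement between the two frames can be checked with numbers. In the sketch below the force on the charge is computed once as a magnetic force in the frame of the conductor and once as an electric force in the proper frame of the charge, following the steps of the solution. All names and numerical values are assumptions made here; in particular the electron velocity `v` is taken unrealistically large so that the subtraction of the two densities is not dominated by rounding errors.

```python
import math

C = 299_792_458.0              # speed of light in m/s
MU0 = 4 * math.pi * 1e-7       # magnetic permeability of empty space
EPS0 = 1.0 / (MU0 * C**2)      # chosen so that EPS0 * MU0 = 1 / C**2 holds exactly


def force_in_conductor_frame(q, u, current, r):
    """Component along e_r of the magnetic force q u x B in the frame of the conductor."""
    B = MU0 * current / (2 * math.pi * r)
    return -q * u * B


def force_in_charge_frame(q, u, current, r, v):
    """Component along e_r of the electric force in the proper frame of the charge."""
    g = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    rho_minus = -current / v                       # from rho_minus * v = -I
    rho_plus = rho_minus                           # conductor neutral in its own frame
    rho_plus_p = g * rho_plus                      # ions at rest in the conductor frame
    rho_minus_p = g * rho_minus * (1 - u * v / C**2)
    rho_net = rho_plus_p - rho_minus_p             # net linear density seen by the charge
    E = rho_net / (2 * math.pi * EPS0 * r)         # field of a charged straight line
    return q * E


q, u, current, r = 1.602e-19, 0.6 * C, 10.0, 0.01  # illustrative values
v = -0.1 * C                                        # electrons move against the current
F_sigma = force_in_conductor_frame(q, u, current, r)
F_sigma_prime = force_in_charge_frame(q, u, current, r, v)
print("F' / F =", F_sigma_prime / F_sigma)
```

For the chosen speed $u = 0.6c$ the ratio comes out as $\gamma_u = 1.25$ up to rounding, and both components are negative, that is, the force points towards the conductor in both frames.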

We note that in the proper frame $\Sigma'$ of the moving charge there does exist a
magnetic field created by the current flowing through the conductor, but this field does not
exert a force on the charge because in $\Sigma'$ the charge is at rest.

In order to compute in $\Sigma'$ the current which flows through the conductor we
consider the currents of the positive ions and the free electrons and we have for the
net total current $I'$:

$$I' = I'_- + I'_+ = -\rho'_- v' + \rho'_+(-u) = -\gamma_u\rho_-(v - u) - \gamma_u\rho_- u = -\gamma_u\rho_- v = \gamma_u I$$

where I is the current flowing through the conductor in $\Sigma$.
Question: Is this result compatible with the Lorentz transformation of the four- 
current? 


13.5 The Electromagnetic Four-Potential 


In the last section we wrote Maxwell equations for empty 3-space in an arbitrary 
LCF $\Sigma$ in terms of the potentials A, $\phi$. In the present section we shall write these
equations in terms of the four-vector formalism. 




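For convenience we collect the obtained results in the following Table:

$$\begin{array}{ll}
\text{Maxwell Equations} & \text{Electromagnetic Potentials} \\[4pt]
\nabla \cdot \mathbf{E} = \dfrac{\rho}{\varepsilon_0} & \mathbf{E} = -\nabla\phi - \dfrac{\partial \mathbf{A}}{\partial t} \quad (13.13) \\[4pt]
\nabla \cdot \mathbf{B} = 0 & \mathbf{B} = \nabla \times \mathbf{A} \quad (13.11) \\[4pt]
\nabla \times \mathbf{B} = \mu_0\mathbf{j} + \dfrac{1}{c^{2}}\dfrac{\partial \mathbf{E}}{\partial t} & \nabla^{2}\phi - \dfrac{1}{c^{2}}\dfrac{\partial^{2}\phi}{\partial t^{2}} = -\dfrac{\rho}{\varepsilon_0} \quad (13.18) \\[4pt]
\nabla \times \mathbf{E} = -\dfrac{\partial \mathbf{B}}{\partial t} & \nabla^{2}\mathbf{A} - \dfrac{1}{c^{2}}\dfrac{\partial^{2}\mathbf{A}}{\partial t^{2}} = -\mu_0\mathbf{j} \quad (13.19) \\[4pt]
\text{Lorentz gauge} & \nabla \cdot \mathbf{A} + \dfrac{1}{c^{2}}\dfrac{\partial\phi}{\partial t} = 0 \quad (13.17)
\end{array}$$

We note that the equations giving the potentials A, $\phi$ in $\Sigma$ are written ($\mu_0\varepsilon_0 = 1/c^{2}$):

$$\text{Lorentz gauge}: \quad \begin{pmatrix} -\frac{1}{c}\frac{\partial}{\partial t} \\ \nabla \end{pmatrix}_{\Sigma} \cdot \begin{pmatrix} \phi/c \\ \mathbf{A} \end{pmatrix}_{\Sigma} = 0 \tag{13.41}$$

$$\text{Maxwell Equations}: \quad \Box\begin{pmatrix} \phi/c \\ \mathbf{A} \end{pmatrix}_{\Sigma} = -\mu_0\begin{pmatrix} \rho c \\ \mathbf{j} \end{pmatrix}_{\Sigma} = -\mu_0 j^{i}. \tag{13.42}$$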


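Equation (13.41) is a continuity equation with “current” $\begin{pmatrix} \phi/c \\ \mathbf{A} \end{pmatrix}_{\Sigma}$, which is
compatible with the assumption $(-\phi/c, \mathbf{A})$ four-vector. Equation (13.42) implies
(due to the quotient theorem) that $(-\phi/c, \mathbf{A})$ must be a four-vector, because the
operator $\Box = \nabla^{2} - \frac{1}{c^{2}}\frac{\partial^{2}}{\partial t^{2}}$ is Lorentz covariant and the rhs is the four-current.

We conclude that there exists a new four-vector, which we denote $\Omega^{i}$ and call the
electromagnetic four-potential which in the LCF $\Sigma$ has components:

$$\Omega^{i} = \begin{pmatrix} \phi/c \\ \mathbf{A} \end{pmatrix}_{\Sigma} \;\Leftrightarrow\; \Omega_{i} = \left(-\frac{\phi}{c}, \mathbf{A}\right)_{\Sigma}. \tag{13.43}$$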
A z c ry 


In terms of the electromagnetic four-potential $\Omega^{i}$ Maxwell equations are written
in covariant form as follows:

$$\Omega^{i}{}_{,i} = 0 \tag{13.44}$$

$$\Omega^{i,j}{}_{,j} = \Box\Omega^{i} = -\mu_0 j^{i}. \tag{13.45}$$




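It is apparent that the knowledge of $\Omega^{i}$ is sufficient to determine the electromagnetic
field. Indeed let us compute the fields E, B in $\Sigma$ in terms of the components of $\Omega^{i}$
(in $\Sigma$!). If in equation (13.13) we consider:

$$ct = x^{0}, \quad x = x^{1}, \quad y = x^{2}, \quad z = x^{3}$$

and use that $\Omega_{i,j} = \frac{\partial\Omega_{i}}{\partial x^{j}}$ it follows easily that¹⁷:

$$\frac{1}{c}E_x = \Omega_{0,1} - \Omega_{1,0}$$

$$\frac{1}{c}E_y = \Omega_{0,2} - \Omega_{2,0}$$

$$\frac{1}{c}E_z = \Omega_{0,3} - \Omega_{3,0}.$$

Similarly from equation (13.11) we have:

$$B_x = \Omega_{3,2} - \Omega_{2,3}$$

$$B_y = \Omega_{1,3} - \Omega_{3,1}$$

$$B_z = \Omega_{2,1} - \Omega_{1,2}.$$

The above relations can be written collectively in the form of a matrix as
follows¹⁸:

$$\begin{pmatrix} 0 & -E_x/c & -E_y/c & -E_z/c \\ E_x/c & 0 & B_z & -B_y \\ E_y/c & -B_z & 0 & B_x \\ E_z/c & B_y & -B_x & 0 \end{pmatrix}_{\Sigma} = \Omega_{j,i} - \Omega_{i,j}. \tag{13.46}$$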


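17 In a compact formalism equation (13.13) is written:

$$E_\mu = -\phi_{,\mu} - cA_{\mu,0}.$$

Now (13.43) gives:

$$\Omega_0 = -\frac{\phi}{c}, \qquad \Omega_\mu = A_\mu$$

from which we compute:

$$\Omega_{0,\mu} = -\frac{1}{c}\phi_{,\mu}, \quad \Omega_{\mu,0} = A_{\mu,0} \;\Rightarrow\; \Omega_{0,\mu} - \Omega_{\mu,0} = -\frac{1}{c}\left(\phi_{,\mu} + cA_{\mu,0}\right) = \frac{1}{c}E_\mu.$$

For the magnetic field we have from (13.11):

$$B^{\mu} = -\epsilon^{\mu\nu\rho}A_{[\nu,\rho]} = -\epsilon^{\mu\nu\rho}\Omega_{[\nu,\rho]}.$$

18 According to our convention the first index of a matrix counts rows and the second columns.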


442 13 The Electromagnetic Field 


The rhs of equation (13.46) is an antisymmetric tensor $F_{ij}$ of order (0,2) which we name the electromagnetic field tensor and define as follows:

\[
F_{ij} = \Omega_{j,i} - \Omega_{i,j}. \tag{13.47}
\]

The components of $F_{ij}$ in the LCF $\Sigma$ are:

\[
[F_{ij}] =
\begin{pmatrix}
0 & -E_x/c & -E_y/c & -E_z/c\\
E_x/c & 0 & B_z & -B_y\\
E_y/c & -B_z & 0 & B_x\\
E_z/c & B_y & -B_x & 0
\end{pmatrix}_\Sigma . \tag{13.48}
\]


From the above we infer the following conclusions: 


a. In a LCF $\Sigma$ the fields $\mathbf{E}, \mathbf{B}$ are parts (components) of a more general field which is the electromagnetic field tensor $F_{ij}$. This is similar to the Newtonian space and Newtonian time which are parts (components) of the relativistic spacetime position vector.

b. The components of a tensor depend on the LCF in which they are computed. Therefore it is possible that the electromagnetic field is expressed in one LCF in terms of an electric field only, in another in terms of a magnetic field only and in another in terms of both an electric and a magnetic field. For example the electromagnetic field of a charge is expressed in the proper frame of the charge by an electric field and in a frame where the charge is moving by an electric and a magnetic field. This fact cannot be explained in terms of Newtonian Physics because there these vector fields are covariant, therefore if they vanish/do not vanish in one frame they vanish/do not vanish in all Newtonian frames.


In the following we shall indicate the electromagnetic field tensor due to the fields $\mathbf{E}, \mathbf{B}$ by $F_{ij} = (\mathbf{E}/c, \mathbf{B})$. Because the units of the components of a tensor must be the same, the magnetic field $\mathbf{B}$ must always be related to $\mathbf{E}/c$.
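As a small illustration of (13.48), the following Python sketch assembles the matrix $[F_{ij}]$ from given fields and reads the fields back from it. The function names `field_tensor` and `fields_from_tensor`, the constant `C` and the sample values are assumptions made for this illustration; they do not appear in the text.

```python
# Minimal sketch: the covariant field tensor F_ij of (13.48) as a nested list.
C = 299_792_458.0  # speed of light in m/s

def field_tensor(E, B, c=C):
    """Return the 4x4 matrix [F_ij] of (13.48) for 3-vectors E (V/m) and B (T)."""
    ex, ey, ez = (comp / c for comp in E)  # E/c has the same unit as B
    bx, by, bz = B
    return [
        [0.0, -ex, -ey, -ez],
        [ex, 0.0, bz, -by],
        [ey, -bz, 0.0, bx],
        [ez, by, -bx, 0.0],
    ]

def fields_from_tensor(F, c=C):
    """Read E and B back from the components of F_ij."""
    E = (c * F[1][0], c * F[2][0], c * F[3][0])
    B = (F[2][3], F[3][1], F[1][2])
    return E, B

F = field_tensor((3.0e5, 0.0, -1.5e5), (0.0, 2.0e-3, 1.0e-3))
is_antisymmetric = all(F[i][j] == -F[j][i] for i in range(4) for j in range(4))
```

Storing $\mathbf{E}/c$ rather than $\mathbf{E}$ keeps all entries of the matrix in the same unit, which is the point made in the paragraph above.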


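Exercise 13.5.1 Show that the potentials $\phi$ and $\mathbf{A}$ under a Lorentz transformation transform as follows:

\[
\mathbf{A}'_\perp = \mathbf{A}_\perp
\]

\[
\mathbf{A}'_\parallel = \gamma\left[\mathbf{A}_\parallel - \frac{\mathbf{u}}{c^2}\phi\right]
\]

\[
\phi' = \gamma\left(\phi - \mathbf{u}\cdot\mathbf{A}_\parallel\right).
\]

[Hint: Consider the electromagnetic four-potential $\Omega^i$.]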


13.6 The Electromagnetic Field Tensor $F_{ij}$


The electromagnetic field tensor is a powerful tool which enables us to use geo- 
metric methods and the Lorentz transformation to answer a number of fundamental 
questions concerning the electromagnetic field. In this section we shall deal with the 
following questions: 


(a) Consider an electromagnetic field $F_{ij}$ which in the LCF $\Sigma$ is represented with the fields $(\mathbf{E}/c, \mathbf{B})$ and in the LCF $\Sigma'$ with the fields $(\mathbf{E}'/c, \mathbf{B}')$. What is the transformation between the fields $(\mathbf{E}/c, \mathbf{B})$ and $(\mathbf{E}'/c, \mathbf{B}')$?

(b) How can one write the electromagnetic field equations (13.44) and (13.45) in terms of the electromagnetic field tensor $F_{ij}$?

(c) What are the invariants of the electromagnetic field (under Lorentz transformations)?


13.6.1 The Transformation of the Fields 


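The electric field $\mathbf{E}$ and the magnetic field $\mathbf{B}$ are not parts of four-vectors, therefore one cannot use the Lorentz transformation relating $\Sigma$ and $\Sigma'$ directly to compute their transformation. Instead one must use the electromagnetic field tensor $F_{ij}$, of which these fields are components. If $L^i{}_{i'}$ is the Lorentz transformation relating $\Sigma$ and $\Sigma'$, the components of $F_{ij}$ are transformed as follows:

\[
F_{i'j'} = L^i{}_{i'}L^j{}_{j'}F_{ij}. \tag{13.49}
\]

Let us compute the transformation of the fields for the case of a boost along the common axis $x, x'$ with velocity $u$. For a boost we have the matrices:

\[
[L^i{}_{i'}] = \begin{pmatrix}
\gamma & \beta\gamma & 0 & 0\\
\beta\gamma & \gamma & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{pmatrix},\qquad
[L^{i'}{}_{i}] = \begin{pmatrix}
\gamma & -\beta\gamma & 0 & 0\\
-\beta\gamma & \gamma & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{pmatrix} \tag{13.50}
\]

hence:

\[
-\frac{E'_x}{c} = F_{0'1'} = L^i{}_{0'}L^j{}_{1'}F_{ij} = L^0{}_{0'}L^1{}_{1'}F_{01} + L^1{}_{0'}L^0{}_{1'}F_{10} = -\gamma^2\frac{E_x}{c} + \beta^2\gamma^2\frac{E_x}{c} = -\frac{E_x}{c}
\]

\[
B'_z = F_{1'2'} = L^i{}_{1'}L^j{}_{2'}F_{ij} = L^i{}_{1'}L^2{}_{2'}F_{i2} = L^0{}_{1'}F_{02} + L^1{}_{1'}F_{12} = \gamma\left(B_z - \beta\frac{E_y}{c}\right)
\]

\[
-B'_y = F_{1'3'} = L^i{}_{1'}L^j{}_{3'}F_{ij} = L^i{}_{1'}L^3{}_{3'}F_{i3} = L^0{}_{1'}F_{03} + L^1{}_{1'}F_{13} = -\gamma\left(\beta\frac{E_z}{c} + B_y\right).
\]

Similarly we compute the remaining elements of the transformed matrix $F_{i'j'}$.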




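A practical method to compute the transformation of the components of the tensor $F_{ij}$ in $\Sigma'$ is the following.¹⁹ The tensor $F_{ij}$ in the LCF $\Sigma$ and $\Sigma'$ is represented with the following matrices:

\[
[F_{ij}] = \begin{pmatrix}
0 & -E_x/c & -E_y/c & -E_z/c\\
E_x/c & 0 & B_z & -B_y\\
E_y/c & -B_z & 0 & B_x\\
E_z/c & B_y & -B_x & 0
\end{pmatrix}_\Sigma,
\qquad
[F_{i'j'}] = \begin{pmatrix}
0 & -E'_x/c & -E'_y/c & -E'_z/c\\
E'_x/c & 0 & B'_z & -B'_y\\
E'_y/c & -B'_z & 0 & B'_x\\
E'_z/c & B'_y & -B'_x & 0
\end{pmatrix}_{\Sigma'} .
\]

Then the Lorentz transformation of the tensor $F_{ij}$ is written as the following matrix equation:

\[
[F_{i'j'}]_{\Sigma'} = [L^i{}_{i'}]^t\,[F_{ij}]_\Sigma\,[L^j{}_{j'}] \tag{13.51}
\]

where the upper index $t$ indicates transpose. The Lorentz matrix for a boost is symmetric, therefore in the rhs we have to consider only the matrix $L^i{}_{i'}$. Introducing the matrices in the rhs we find:

\[
[F_{i'j'}] = \begin{pmatrix}
0 & -E_x/c & -\gamma\left(\frac{E_y}{c} - \beta B_z\right) & -\gamma\left(\frac{E_z}{c} + \beta B_y\right)\\
E_x/c & 0 & \gamma\left(B_z - \beta\frac{E_y}{c}\right) & -\gamma\left(B_y + \beta\frac{E_z}{c}\right)\\
\gamma\left(\frac{E_y}{c} - \beta B_z\right) & -\gamma\left(B_z - \beta\frac{E_y}{c}\right) & 0 & B_x\\
\gamma\left(\frac{E_z}{c} + \beta B_y\right) & \gamma\left(B_y + \beta\frac{E_z}{c}\right) & -B_x & 0
\end{pmatrix}. \tag{13.52}
\]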


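If we replace $F_{i'j'}$ in terms of $\mathbf{E}', \mathbf{B}'$ we obtain the transformation equations of the electric and the magnetic field:

\[
\begin{aligned}
E'_x/c &= E_x/c & B'_x &= B_x\\
E'_y/c &= \gamma\left(E_y/c - \beta B_z\right) & B'_y &= \gamma\left(B_y + \beta E_z/c\right)\\
E'_z/c &= \gamma\left(E_z/c + \beta B_y\right) & B'_z &= \gamma\left(B_z - \beta E_y/c\right).
\end{aligned} \tag{13.53}
\]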

¹⁹ See also Exercise 1.7.3. This practical rule applies to tensors of type (0,2) and (2,0)! It can be generalized to tensors with more indices; however, this is outside our interest.




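Another useful relation among the pairs $(\mathbf{E}/c, \mathbf{B})$, $(\mathbf{E}'/c, \mathbf{B}')$ relates the parallel and the normal projection of the fields along the direction of the relative velocity $\mathbf{u}$ of $\Sigma$ and $\Sigma'$. Indeed by writing $\mathbf{u} = u\mathbf{i}$ we have from (13.53):

\[
\mathbf{E}'_\parallel = E'_x\mathbf{i}' = E_x\mathbf{i} = \mathbf{E}_\parallel
\]

\[
\begin{aligned}
\mathbf{E}'_\perp &= E'_y\mathbf{j}' + E'_z\mathbf{k}' = \gamma\left(E_y - uB_z\right)\mathbf{j} + \gamma\left(E_z + uB_y\right)\mathbf{k}\\
&= \gamma\left(E_y\mathbf{j} + E_z\mathbf{k}\right) + \gamma u\left(-B_z\mathbf{j} + B_y\mathbf{k}\right) = \gamma\left[\mathbf{E}_\perp + (\mathbf{u}\times\mathbf{B})_\perp\right].
\end{aligned}
\]

Similarly we work with $\mathbf{B}'_\parallel, \mathbf{B}'_\perp$. Finally we obtain the relations:

\[
\begin{aligned}
\mathbf{E}'_\parallel &= \mathbf{E}_\parallel\\
\mathbf{E}'_\perp &= \gamma\left[\mathbf{E}_\perp + \mathbf{u}\times\mathbf{B}\right]\\
\mathbf{B}'_\parallel &= \mathbf{B}_\parallel\\
\mathbf{B}'_\perp &= \gamma\left[\mathbf{B}_\perp - \frac{1}{c^2}\mathbf{u}\times\mathbf{E}\right].
\end{aligned} \tag{13.54}
\]

It follows that the projections of the fields $\mathbf{E}, \mathbf{B}$ along the direction of the velocity $\mathbf{u}$ do not change, whereas the normal projections acquire the additional terms $\mathbf{u}\times\mathbf{B}$ and $-\frac{1}{c^2}\mathbf{u}\times\mathbf{E}$ respectively and are multiplied by $\gamma$. Relations (13.54) are more general than (13.53) because they apply to a general relative velocity $\mathbf{u}$ and not only to a boost along the $x$-axis.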


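Exercise 13.6.1 Prove that the transformation equations (13.54) can be written equivalently as follows:

\[
\mathbf{E}' = \gamma\mathbf{E} + \frac{1-\gamma}{u^2}(\mathbf{u}\cdot\mathbf{E})\mathbf{u} + \gamma(\mathbf{u}\times\mathbf{B}) \tag{13.55}
\]

\[
\mathbf{B}' = \gamma\mathbf{B} + \frac{1-\gamma}{u^2}(\mathbf{u}\cdot\mathbf{B})\mathbf{u} - \frac{\gamma}{c^2}(\mathbf{u}\times\mathbf{E}). \tag{13.56}
\]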


13.6.2 Maxwell Equations in Terms of $F_{ij}$


From the definition $F_{ij} = \Omega_{j,i} - \Omega_{i,j}$ we have by cyclic permutation of the indices:

\[
\begin{aligned}
F_{ij,k} &= -\Omega_{i,jk} + \Omega_{j,ik}\\
F_{jk,i} &= -\Omega_{j,ki} + \Omega_{k,ji}\\
F_{ki,j} &= -\Omega_{k,ij} + \Omega_{i,kj}.
\end{aligned}
\]

Adding we find the equation:

\[
F_{ij,k} + F_{jk,i} + F_{ki,j} = 0. \tag{13.57}
\]

This is one of the equations of the electromagnetic field, which essentially is equivalent to the existence of the electromagnetic potential.²⁰

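The second equation relates $F_{ij}$ with the four-current $j^i$ and it corresponds to the main Maxwell equation $\Box\,\Omega^i = -\mu_0 j^i$. In order to find this new equation we write:

\[
F_i{}^{j} = -\Omega_i{}^{,j} + \Omega^{j}{}_{,i}
\]

and differentiate wrt $x^j$:

\[
F_i{}^{j}{}_{,j} = -\Omega_i{}^{,j}{}_{,j} + \Omega^{j}{}_{,ij}.
\]

From the equation of continuity (13.44) the term $\Omega^{j}{}_{,ij} = (\Omega^{j}{}_{,j})_{,i} = 0$, whereas from equation (13.45) the term $\Omega_i{}^{,j}{}_{,j} = \Box\,\Omega_i = -\mu_0 j_i$. It follows that the second Maxwell equation in terms of the electromagnetic field tensor $F_{ij}$ is:

\[
F^{ij}{}_{,j} = \mu_0 j^i. \tag{13.58}
\]

(With the components (13.48) the zeroth component of this equation reads $\frac{1}{c}\nabla\cdot\mathbf{E} = \mu_0\rho c$, that is, Gauss's law.)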


13.6.3 The Invariants of the Electromagnetic Field 


The transformation relations (13.53) and (13.54) enable us to compute the electro- 
magnetic field in any LCF if we know it in one. This is a characteristic of Special 
Relativity, which makes possible the solution of a problem in a special LCF where 
the answer is either known or most convenient to be found and then transfer the 
solution to the required LCF. In electromagnetism this special LCF is selected (as a 
rule) by the invariants of the electromagnetic field. 

The electromagnetic field is described completely by the electromagnetic field tensor, therefore the invariants of the field are given by the invariants of the tensor $F_{ij}$. The tensor $F_{ij}$ being antisymmetric has two, and only two, invariants given by the following expressions:

\[
X = -\frac{1}{2}F_{ij}F^{ij} \quad\text{and}\quad Y = \frac{1}{8}\eta_{ijkl}F^{ij}F^{kl} \tag{13.59}
\]

where $\eta_{ijkl}$ is the completely antisymmetric tensor with four indices.²¹


²⁰ It is the so called Frobenius condition, which need not worry us further.

²¹ The tensor $\eta_{ijkl}$, where $i, j, k, l = 0, 1, 2, 3$, equals zero if two of the indices have the same value and $\pm 1$ if $(ijkl)$ is an even or an odd permutation of $(0123)$. See Sect. 13.10.1.




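In a LCF $\Sigma$ these invariants are computed in terms of the electric and the magnetic field. Indeed considering the components of $F_{ij}$ in $\Sigma$ we have:

\[
\begin{aligned}
X &= -\frac{1}{2}F_{ij}F^{ij} = -\left(F_{01}F^{01} + F_{02}F^{02} + F_{03}F^{03} + F_{12}F^{12} + F_{13}F^{13} + F_{23}F^{23}\right)\\
&= -\frac{1}{c^2}\left[E_x(-E_x) + E_y(-E_y) + E_z(-E_z)\right] - \left[B_x^2 + B_y^2 + B_z^2\right]\\
&= \frac{1}{c^2}\mathbf{E}^2 - \mathbf{B}^2
\end{aligned} \tag{13.60}
\]

\[
\begin{aligned}
Y &= \frac{1}{8}\eta_{ijkl}F^{ij}F^{kl} = \eta_{0123}F^{01}F^{23} + \eta_{0231}F^{02}F^{31} + \eta_{0312}F^{03}F^{12}\\
&= \frac{E_x}{c}B_x + \frac{E_y}{c}B_y + \frac{E_z}{c}B_z\\
&= \frac{1}{c}\mathbf{E}\cdot\mathbf{B}.
\end{aligned} \tag{13.61}
\]

In the next example we show that the quantities $\frac{1}{c^2}\mathbf{E}^2 - \mathbf{B}^2$ and $\frac{1}{c}\mathbf{E}\cdot\mathbf{B}$ are indeed invariant under Lorentz transformations.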


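Example 13.6.1 Consider two LCF $\Sigma$ and $\Sigma'$ which are moving in the standard way and let $\mathbf{E}, \mathbf{B}$ and $\mathbf{E}', \mathbf{B}'$ be the electric and the magnetic field in $\Sigma$ and $\Sigma'$ respectively. Prove that:

\[
\mathbf{E}\cdot\mathbf{B} = \mathbf{E}'\cdot\mathbf{B}' \quad\text{and}\quad \mathbf{B}^2 - \frac{1}{c^2}\mathbf{E}^2 = \mathbf{B}'^2 - \frac{1}{c^2}\mathbf{E}'^2 .
\]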


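1st Solution
From the transformation equations (13.53) we have:

\[
\begin{aligned}
\frac{1}{c}\mathbf{E}'\cdot\mathbf{B}' &= \frac{1}{c}\left[E'_xB'_x + E'_yB'_y + E'_zB'_z\right]\\
&= \frac{1}{c}E_xB_x + \gamma^2\left(E_y/c - \beta B_z\right)\left(B_y + \beta E_z/c\right) + \gamma^2\left(E_z/c + \beta B_y\right)\left(B_z - \beta E_y/c\right)\\
&= \frac{1}{c}\left[E_xB_x + (\gamma^2 - \gamma^2\beta^2)E_yB_y + (\gamma^2 - \gamma^2\beta^2)E_zB_z\right]\\
&= \frac{1}{c}\left(E_xB_x + E_yB_y + E_zB_z\right)\\
&= \frac{1}{c}\mathbf{E}\cdot\mathbf{B}.
\end{aligned}
\]

Also:

\[
\begin{aligned}
\frac{1}{c^2}\mathbf{E}'^2 - \mathbf{B}'^2 &= \frac{1}{c^2}\left(E'^2_x + E'^2_y + E'^2_z\right) - \left(B'^2_x + B'^2_y + B'^2_z\right)\\
&= \frac{1}{c^2}E_x^2 + \gamma^2\left(\frac{1}{c}E_y - \beta B_z\right)^2 + \gamma^2\left(\frac{1}{c}E_z + \beta B_y\right)^2\\
&\quad - B_x^2 - \gamma^2\left(B_y + \beta\frac{1}{c}E_z\right)^2 - \gamma^2\left(B_z - \beta\frac{1}{c}E_y\right)^2\\
&= \frac{1}{c^2}E_x^2 + (\gamma^2 - \gamma^2\beta^2)\frac{1}{c^2}E_y^2 + (\gamma^2 - \gamma^2\beta^2)\frac{1}{c^2}E_z^2\\
&\quad - B_x^2 - (\gamma^2 - \gamma^2\beta^2)B_y^2 - (\gamma^2 - \gamma^2\beta^2)B_z^2\\
&= \frac{1}{c^2}\left(E_x^2 + E_y^2 + E_z^2\right) - \left(B_x^2 + B_y^2 + B_z^2\right)\\
&= \frac{1}{c^2}\mathbf{E}^2 - \mathbf{B}^2 ,
\end{aligned}
\]

where in the third step the cross terms $\mp 2\gamma^2\beta\frac{1}{c}E_yB_z$ and $\pm 2\gamma^2\beta\frac{1}{c}E_zB_y$ cancel in pairs.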


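2nd Solution
In this solution we use the transformation equations (13.54) which define the general Lorentz transformation, therefore the proof is completely general. We have:

\[
\begin{aligned}
\mathbf{E}'\cdot\mathbf{B}' &= \left[\mathbf{E}_\parallel + \gamma\left(\mathbf{E}_\perp + \mathbf{v}\times\mathbf{B}\right)\right]\cdot\left[\mathbf{B}_\parallel + \gamma\left(\mathbf{B}_\perp - \frac{1}{c^2}\mathbf{v}\times\mathbf{E}\right)\right]\\
&= \mathbf{E}_\parallel\cdot\mathbf{B}_\parallel + \gamma^2\mathbf{E}_\perp\cdot\mathbf{B}_\perp - \frac{\gamma^2}{c^2}(\mathbf{v}\times\mathbf{B})\cdot(\mathbf{v}\times\mathbf{E}).
\end{aligned}
\]

From classical vector calculus we have the identity:

\[
(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{C}\times\mathbf{D}) = (\mathbf{A}\cdot\mathbf{C})(\mathbf{B}\cdot\mathbf{D}) - (\mathbf{A}\cdot\mathbf{D})(\mathbf{B}\cdot\mathbf{C}).
\]

Using the above identity we compute:

\[
\frac{1}{c^2}(\mathbf{v}\times\mathbf{B})\cdot(\mathbf{v}\times\mathbf{E}) = \beta^2(\mathbf{E}\cdot\mathbf{B}) - \beta^2\mathbf{E}_\parallel\cdot\mathbf{B}_\parallel .
\]

Replacing we find:

\[
\begin{aligned}
\mathbf{E}'\cdot\mathbf{B}' &= (1 + \gamma^2\beta^2)\mathbf{E}_\parallel\cdot\mathbf{B}_\parallel + \gamma^2\mathbf{E}_\perp\cdot\mathbf{B}_\perp - \gamma^2\beta^2(\mathbf{E}\cdot\mathbf{B})\\
&= \gamma^2\left(\mathbf{E}_\parallel\cdot\mathbf{B}_\parallel + \mathbf{E}_\perp\cdot\mathbf{B}_\perp\right) - \gamma^2\beta^2(\mathbf{E}\cdot\mathbf{B}) = (\gamma^2 - \gamma^2\beta^2)(\mathbf{E}\cdot\mathbf{B})\\
&= \mathbf{E}\cdot\mathbf{B}.
\end{aligned}
\]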


In the same manner one proves the second relation. 
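The invariance proved in Example 13.6.1 can also be checked numerically for a relative velocity in an arbitrary direction. The following Python sketch applies (13.54) and compares the quantities $X = \frac{1}{c^2}\mathbf{E}^2 - \mathbf{B}^2$ and $Y = \frac{1}{c}\mathbf{E}\cdot\mathbf{B}$ before and after the transformation. The function names, the choice of units with $c = 1$ and the sample vectors are assumptions of this illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def transform_fields(E, B, u, c=1.0):
    """Fields in a frame moving with (non-zero) velocity u, following (13.54)."""
    u2 = dot(u, u)
    g = 1.0 / math.sqrt(1.0 - u2 / c**2)
    E_par = tuple(dot(E, u) / u2 * x for x in u)  # projection of E along u
    B_par = tuple(dot(B, u) / u2 * x for x in u)  # projection of B along u
    uxB, uxE = cross(u, B), cross(u, E)
    E_new = tuple(E_par[k] + g * (E[k] - E_par[k] + uxB[k]) for k in range(3))
    B_new = tuple(B_par[k] + g * (B[k] - B_par[k] - uxE[k] / c**2) for k in range(3))
    return E_new, B_new

def invariants(E, B, c=1.0):
    """Return (X, Y) with X = E^2/c^2 - B^2 and Y = E.B/c."""
    return dot(E, E) / c**2 - dot(B, B), dot(E, B) / c

E, B, u = (1.0, 2.0, 3.0), (0.5, -1.0, 2.0), (0.3, -0.2, 0.4)
X, Y = invariants(E, B)
X_new, Y_new = invariants(*transform_fields(E, B, u))
```

Such a check does not replace the proof, but it is a cheap safeguard against sign errors when the transformation is coded for later use.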


13.7 The Physical Significance of the Electromagnetic Invariants


The invariants of the electromagnetic field contain significant information con- 
cerning the character of the electromagnetic field and can be used to classify the 
electromagnetic fields in sets with similar dynamical properties. 

As we have seen the electromagnetic field has two invariants, the quantities $X = \frac{1}{c^2}\mathbf{E}^2 - \mathbf{B}^2$ and $Y = \frac{1}{c}\mathbf{E}\cdot\mathbf{B}$. $X$ measures the difference of the strengths of the fields and $Y$ their angle. Therefore we consider the following four cases:

(i) $X = Y = 0$
The electromagnetic fields with vanishing invariants are characterized with the properties:

\[
\frac{1}{c}|\mathbf{E}| = |\mathbf{B}| \quad\text{and}\quad \mathbf{E}\perp\mathbf{B}
\]

that is, in all LCF the electric and the magnetic field have the same strength and they are normal to each other. These electromagnetic fields we call null electromagnetic fields and we assume that they describe electromagnetic waves.

(ii) $X \neq 0$, $Y = 0$
The electromagnetic fields with invariants $X \neq 0$, $Y = 0$ are characterized by the fact that in an arbitrary LCF either the electric and the magnetic field are perpendicular and have different strength or one of them vanishes. Which of them can vanish we find from the sign of the invariant $X$. If $X > 0$ then only the magnetic field can vanish and if $X < 0$ then only the electric field can vanish.

(iii) $X = 0$, $Y \neq 0$
The electric and the magnetic fields of these electromagnetic fields have equal strength in all LCF ($\frac{1}{c}|\mathbf{E}| = |\mathbf{B}|$) and in no LCF are they perpendicular. If $Y > 0$ then (in all LCF!) their directions make an acute angle and if $Y < 0$ this angle is obtuse. In both cases there exists always a family of LCF in which $\mathbf{E} \parallel \mathbf{B}$.



(iv) $X \neq 0$, $Y \neq 0$
This case is similar to (iii) with the difference that the strengths of the fields $\mathbf{E}, \mathbf{B}$ are different in all LCF.


From the above analysis we conclude that we should consider three types of electromagnetic fields:

(a) Electromagnetic waves ($X = Y = 0$).

(b) Electromagnetic fields with electric or magnetic field only or both fields perpendicular and with different strength ($X \neq 0$, $Y = 0$).

(c) Electromagnetic fields with both electric and magnetic field ($X = 0$ or $X \neq 0$ and $Y \neq 0$) for which there exists a family of LCF in which $\mathbf{E} \parallel \mathbf{B}$.

Because the invariant $Y$ determines if the electric and the magnetic field are perpendicular or parallel, in our study we consider the cases $Y = 0$ and $Y \neq 0$.
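The four cases (i)-(iv) translate directly into a small decision routine. The sketch below is one possible way to code the classification. The function name `classify_field`, the returned labels and the absolute tolerance `tol` used to decide whether an invariant "vanishes" in floating point arithmetic are all assumptions of this illustration.

```python
def classify_field(E, B, c=1.0, tol=1e-12):
    """Classify an electromagnetic field by its invariants X and Y.

    Returns one of the labels 'i', 'ii-electric', 'ii-magnetic', 'iii', 'iv',
    following the cases (i)-(iv) of the text.
    """
    X = sum(e * e for e in E) / c**2 - sum(b * b for b in B)
    Y = sum(e * b for e, b in zip(E, B)) / c
    x_zero, y_zero = abs(X) <= tol, abs(Y) <= tol
    if x_zero and y_zero:
        return "i"            # null field
    if y_zero:
        # X > 0: a frame with electric field only exists; X < 0: magnetic only
        return "ii-electric" if X > 0 else "ii-magnetic"
    if x_zero:
        return "iii"          # equal strengths, never perpendicular
    return "iv"               # different strengths, never perpendicular

labels = [
    classify_field((0.0, 1.0, 0.0), (0.0, 0.0, 1.0)),
    classify_field((0.0, 2.0, 0.0), (0.0, 0.0, 1.0)),
    classify_field((0.0, 1.0, 0.0), (0.0, 0.0, 2.0)),
    classify_field((1.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
    classify_field((2.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
]
```

In practice the tolerance should be scaled to the magnitude of the fields; a fixed absolute value is used here only to keep the sketch short.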


13.7.1 The Case Y =0 


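Consider an electromagnetic field whose invariant $Y = 0$. We prove that there exist families of LCF in which the electromagnetic field has only electric field (if $X > 0$) or only magnetic field (if $X < 0$).

Suppose that in some LCF $\Sigma$ the electromagnetic field has $\mathbf{E} = 0$ (respectively $\mathbf{B} = 0$). Then in all LCF $\Sigma'$ with velocities $\parallel \mathbf{B}$ (respectively $\parallel \mathbf{E}$) the field has only magnetic (respectively electric) field.

Consider now an arbitrary LCF $\Sigma$, in which $\mathbf{E} \neq 0$ and $\mathbf{B} \neq 0$, and assume that $X < 0$. We consider the LCF $\Sigma'$ with velocity

\[
\mathbf{u} = \alpha(\mathbf{E}\times\mathbf{B})
\]

where $\alpha$ is a parameter which has to be determined. Because the velocity is perpendicular to the fields we have:

\[
\mathbf{E}_\parallel = \mathbf{B}_\parallel = 0.
\]

From the transformation equations (13.54) we find for the field $\mathbf{E}'$ in $\Sigma'$:

\[
\mathbf{E}'_\parallel = \mathbf{E}_\parallel = 0
\]

\[
\mathbf{E}'_\perp = \gamma\left\{\mathbf{E}_\perp + \mathbf{u}\times\mathbf{B}\right\} = \gamma\left\{\mathbf{E} + \alpha\left[(\mathbf{E}\times\mathbf{B})\times\mathbf{B}\right]\right\} = \gamma(1 - \alpha B^2)\mathbf{E}
\]

where we have taken into consideration that $Y = \frac{1}{c}\mathbf{E}\cdot\mathbf{B} = 0$. Because we have assumed $X < 0$ we can define the LCF $\Sigma'$ with the condition $\mathbf{E}' = 0$, that is, there exists only magnetic field. Then the last equation gives:

\[
\alpha = 1/B^2 \ \Rightarrow\ \mathbf{u} = \frac{1}{B^2}(\mathbf{E}\times\mathbf{B}).
\]

The magnetic field $\mathbf{B}'$ in $\Sigma'$ is given from the relation (13.54):

\[
\mathbf{B}' = \gamma_u\left(\mathbf{B} - \frac{1}{c^2}\mathbf{u}\times\mathbf{E}\right) = \gamma_u\left(1 - \frac{E^2}{c^2B^2}\right)\mathbf{B}
\]

and it is $\mathbf{B}' \parallel \mathbf{B}$ and also $\mathbf{B}' \perp \mathbf{u}$.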
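As a numerical illustration of the case $Y = 0$, $X < 0$, the following Python sketch computes the velocity $\mathbf{u} = (\mathbf{E}\times\mathbf{B})/B^2$ and verifies with (13.54) that the electric field vanishes in the new frame. The helper names, the units with $c = 1$ and the sample fields are assumptions of this illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def transform_fields(E, B, u, c=1.0):
    """Fields in a frame moving with (non-zero) velocity u, following (13.54)."""
    u2 = dot(u, u)
    g = 1.0 / math.sqrt(1.0 - u2 / c**2)
    E_par = tuple(dot(E, u) / u2 * x for x in u)
    B_par = tuple(dot(B, u) / u2 * x for x in u)
    uxB, uxE = cross(u, B), cross(u, E)
    E_new = tuple(E_par[k] + g * (E[k] - E_par[k] + uxB[k]) for k in range(3))
    B_new = tuple(B_par[k] + g * (B[k] - B_par[k] - uxE[k] / c**2) for k in range(3))
    return E_new, B_new

def purely_magnetic_frame_velocity(E, B):
    """u = (E x B)/B^2; meaningful only if E.B = 0 and E^2/c^2 < B^2."""
    b2 = dot(B, B)
    return tuple(x / b2 for x in cross(E, B))

E, B = (0.0, 1.0, 0.0), (0.0, 0.0, 2.0)   # Y = 0 and X = 1 - 4 < 0
u = purely_magnetic_frame_velocity(E, B)
E_new, B_new = transform_fields(E, B, u)
```

For these sample values $u = 0.5$ along $x$, and the surviving magnetic field has the strength $B\sqrt{1 - E^2/(c^2B^2)}$, in agreement with the expression for $\mathbf{B}'$ above.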

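Therefore LCF $\Sigma'$ belongs to the set of LCFs in which there exists magnetic field only. As we have shown above, in all LCF $\Sigma''$ with velocities $\mathbf{u}' = \lambda\mathbf{B}'$ relative to $\Sigma'$, where $\lambda$ is a real parameter, the electromagnetic field has also only magnetic part. In order to compute the velocity, $\mathbf{v}$ say, of $\Sigma''$ wrt $\Sigma$ we apply the relativistic rule of composition of 3-velocities (Example 6.3.1). Because $\mathbf{u}\cdot\mathbf{u}' = \lambda\,\mathbf{u}\cdot\mathbf{B}' = 0$ the factor $Q$ of the composition rule equals 1 and the rule reduces to $\mathbf{v} = \mathbf{u} + \frac{1}{\gamma_u}\lambda\mathbf{B}'$. Replacing we find:

\[
\mathbf{v} = \mathbf{u} + \lambda\left(1 - \frac{E^2}{c^2B^2}\right)\mathbf{B} = \frac{1}{B^2}(\mathbf{E}\times\mathbf{B}) + \lambda\left(1 - \frac{E^2}{c^2B^2}\right)\mathbf{B}.
\]

In case the invariant $X > 0$ then working in a similar manner we find:

\[
\mathbf{B}' = \gamma\left(\mathbf{B} - \frac{1}{c^2}\mathbf{u}\times\mathbf{E}\right) = \gamma\left(1 - \alpha\frac{E^2}{c^2}\right)\mathbf{B}
\]

and requiring that $\mathbf{B}' = 0$ we find the velocity:

\[
\mathbf{u} = \frac{c^2}{E^2}(\mathbf{E}\times\mathbf{B}).
\]

The electromagnetic field in $\Sigma'$ has only electric field, which is given by the relation:

\[
\mathbf{E}' = \gamma\left(\mathbf{E} + \mathbf{u}\times\mathbf{B}\right) = \gamma\left(1 - \frac{c^2B^2}{E^2}\right)\mathbf{E}.
\]

In this case the LCF $\Sigma'$ is not unique and all LCF $\Sigma''$ with velocities $\mathbf{u} = \frac{c^2}{E^2}(\mathbf{E}\times\mathbf{B}) + \lambda\left(1 - \frac{c^2B^2}{E^2}\right)\mathbf{E}$ have the same property.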


13.7.2 The Case $Y \neq 0$


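In this case the electric and the magnetic field do not vanish and they are not perpendicular in any LCF. We shall show that in this case there exists a family of LCF in which $\mathbf{E} \parallel \mathbf{B}$.

We consider a LCF $\Sigma'$, which relative to the LCF $\Sigma$ has velocity $\mathbf{u}$ normal to the plane defined by the fields $\mathbf{E}, \mathbf{B}$ in $\Sigma$. We write:

\[
\mathbf{u} = \alpha(\mathbf{E}\times\mathbf{B})
\]

where $\alpha$ is a parameter which must be determined. From the transformation relations (13.54) we obtain for the electric and the magnetic field in $\Sigma'$:

\[
\mathbf{E}' = \gamma\left(\mathbf{E} + \mathbf{u}\times\mathbf{B}\right)
\]

\[
\mathbf{B}' = \gamma\left(\mathbf{B} - \frac{1}{c^2}\mathbf{u}\times\mathbf{E}\right).
\]

We compute²²:

\[
\mathbf{u}\times\mathbf{B} = \alpha(\mathbf{E}\times\mathbf{B})\times\mathbf{B} = \alpha cY\mathbf{B} - \alpha B^2\mathbf{E}
\]

\[
\mathbf{u}\times\mathbf{E} = \alpha(\mathbf{E}\times\mathbf{B})\times\mathbf{E} = -\alpha cY\mathbf{E} + \alpha E^2\mathbf{B}.
\]

We replace this result and find:

\[
\mathbf{E}' = \gamma\left[(1 - \alpha B^2)\mathbf{E} + \alpha cY\mathbf{B}\right] \tag{13.62}
\]

\[
\mathbf{B}' = \gamma\left[\left(1 - \alpha\frac{E^2}{c^2}\right)\mathbf{B} + \alpha Y\frac{1}{c}\mathbf{E}\right]. \tag{13.63}
\]

We note that the fields $\mathbf{E}'$ and $\mathbf{B}'$ are in the plane of $\mathbf{E}, \mathbf{B}$, therefore they are perpendicular to the velocity $\mathbf{u}$. The cross product:

\[
\mathbf{E}'\times\mathbf{B}' = \gamma^2\left[(1 - \alpha B^2)\left(1 - \alpha\frac{E^2}{c^2}\right) - \alpha^2Y^2\right](\mathbf{E}\times\mathbf{B}).
\]

The condition for the electric and the magnetic field to be parallel in $\Sigma'$ is:

\[
\mathbf{E}'\times\mathbf{B}' = 0
\]

²² We are using the identity $(\mathbf{A}\times\mathbf{B})\times\mathbf{C} = (\mathbf{A}\cdot\mathbf{C})\mathbf{B} - (\mathbf{B}\cdot\mathbf{C})\mathbf{A}$.

which implies the condition:

\[
(1 - \alpha B^2)\left(1 - \alpha\frac{E^2}{c^2}\right) - \alpha^2Y^2 = 0
\]

or, after some easy algebra:

\[
\alpha^2\left(\frac{1}{c^2}E^2B^2 - Y^2\right) - \alpha\left(B^2 + \frac{E^2}{c^2}\right) + 1 = 0.
\]

But:

\[
Y = \frac{1}{c}\mathbf{B}\cdot\mathbf{E} = \frac{1}{c}BE\cos\varphi \ \Rightarrow\ \frac{1}{c^2}E^2B^2 - Y^2 = \frac{1}{c^2}E^2B^2\sin^2\varphi = \left|\frac{1}{c}\mathbf{E}\times\mathbf{B}\right|^2 .
\]

Also $\alpha = \dfrac{\beta}{\left|\frac{1}{c}\mathbf{E}\times\mathbf{B}\right|}$. Replacing we find the following equation for the parameter $\beta$:

\[
\beta^2 - \frac{B^2 + \frac{E^2}{c^2}}{\left|\frac{1}{c}\mathbf{E}\times\mathbf{B}\right|}\beta + 1 = 0
\]

whose root with $\beta < 1$ is:

\[
\beta = \frac{1}{2}\left[A - \sqrt{A^2 - 4}\right] \tag{13.64}
\]

where $A = \dfrac{B^2 + \frac{E^2}{c^2}}{\left|\frac{1}{c}\mathbf{E}\times\mathbf{B}\right|} > 2$. We conclude that the velocity of $\Sigma'$ relative to $\Sigma$ is:

\[
\mathbf{u} = \beta c\,\frac{\frac{1}{c}\mathbf{E}\times\mathbf{B}}{\left|\frac{1}{c}\mathbf{E}\times\mathbf{B}\right|}
\]

where $\beta$ is given in (13.64).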
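The construction above can be tested numerically. The following Python sketch computes $A$, takes the root of the quadratic for $\beta$ that is smaller than 1, builds the velocity $\mathbf{u}$ along $\mathbf{E}\times\mathbf{B}$ and checks with (13.54) that $\mathbf{E}'\times\mathbf{B}'$ vanishes. The function names, the units with $c = 1$ and the sample fields are assumptions of this illustration; the sketch also assumes $\mathbf{E}\times\mathbf{B} \neq 0$ (otherwise the fields are already parallel).

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def transform_fields(E, B, u, c=1.0):
    """Fields in a frame moving with (non-zero) velocity u, following (13.54)."""
    u2 = dot(u, u)
    g = 1.0 / math.sqrt(1.0 - u2 / c**2)
    E_par = tuple(dot(E, u) / u2 * x for x in u)
    B_par = tuple(dot(B, u) / u2 * x for x in u)
    uxB, uxE = cross(u, B), cross(u, E)
    E_new = tuple(E_par[k] + g * (E[k] - E_par[k] + uxB[k]) for k in range(3))
    B_new = tuple(B_par[k] + g * (B[k] - B_par[k] - uxE[k] / c**2) for k in range(3))
    return E_new, B_new

def parallel_fields_frame_velocity(E, B, c=1.0):
    """Velocity u = beta*c*(E x B)/|E x B| with beta from (13.64)."""
    ExB = cross(E, B)
    norm = math.sqrt(dot(ExB, ExB))
    A = (dot(B, B) + dot(E, E) / c**2) / (norm / c)
    beta = 0.5 * (A - math.sqrt(A * A - 4.0))  # the root with beta < 1
    return tuple(beta * c * x / norm for x in ExB)

E, B = (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)   # Y = 1, not perpendicular
u = parallel_fields_frame_velocity(E, B)
E_new, B_new = transform_fields(E, B, u)
residual = cross(E_new, B_new)
```

For these sample fields $A = 3$, so $\beta = (3 - \sqrt{5})/2 \approx 0.38$, and the components of `residual` vanish up to rounding.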
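It remains to compute the proportionality factor relating the strengths of the two fields. We write:

\[
\mathbf{B}' = \lambda\frac{1}{c}\mathbf{E}' \ \Rightarrow\ \lambda = \frac{\mathbf{E}'\cdot c\mathbf{B}'}{E'^2} = \frac{c^2Y}{E'^2}.
\]

For the length $E'^2$ we have from (13.62):

\[
\begin{aligned}
E'^2 &= \gamma^2\left[E^2 - 2\alpha B^2E^2 + 2\alpha c^2Y^2 + \alpha^2B^4E^2 - \alpha^2c^2Y^2B^2\right]\\
&= \gamma^2E^2\left[1 - 2\alpha B^2\sin^2\varphi + \alpha^2B^4\sin^2\varphi\right]
\end{aligned}
\]

hence:

\[
\lambda = \frac{cB\cos\varphi}{\gamma^2E\left[1 - 2\alpha B^2\sin^2\varphi + \alpha^2B^4\sin^2\varphi\right]}.
\]

$\Sigma'$ is not the unique LCF in which the electric and the magnetic field are parallel. Indeed it can be shown easily that the same holds in any other LCF in which the relative velocity wrt $\Sigma'$ is $\mathbf{u}' = \kappa\hat{\mathbf{e}}'$, where $\kappa$ is a real parameter and $\hat{\mathbf{e}}'$ is the unit vector in the common direction of the fields in $\Sigma'$. In order to compute the velocity, $\mathbf{v}$ say, of these LCF relative to $\Sigma$ we consider the relativistic rule of composition of 3-velocities. Because the fields $\mathbf{E}', \mathbf{B}'$, and therefore the vector $\hat{\mathbf{e}}'$, are normal to the velocity $\mathbf{u}$, we have $\mathbf{u}\cdot\mathbf{u}' = 0$, hence the factor $Q$ of the composition rule equals 1 and:

\[
\mathbf{v} = \mathbf{u} + \frac{\kappa}{\gamma_u}\hat{\mathbf{e}}'.
\]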

13.8 Motion of a Charge in an Electromagnetic Field: The Lorentz Force

Consider a charge $q$ and assume that in the proper frame of the charge $\Sigma^+$ there exists an electric and a magnetic field given by the vectors $\frac{1}{c}\mathbf{E}^+, \mathbf{B}^+$. We define the 3-force on the charge in $\Sigma^+$ due to this electromagnetic field to be $\mathbf{f}^+ = q\mathbf{E}^+$. This definition coincides with the Newtonian force on a charge resting in an electric and a magnetic field.²³ Then in the proper frame of the charge the four-force $F^i = \begin{pmatrix}0\\ q\mathbf{E}^+\end{pmatrix}_{\Sigma^+}$ is completely determined and can be computed in any other LCF using the appropriate Lorentz transformation.

However instead of using the Lorentz transformation to compute the 3-force in an arbitrary LCF, $\Sigma$ say, it is more fundamental to write the four-force $F^i$ covariantly in an arbitrary LCF and then compute the 3-force on the charge in $\Sigma$ by taking the components of the quantities involved in $\Sigma$.

²³ Recall that the definitions of physical quantities in Special Relativity are made in the proper frame and coincide with the corresponding Newtonian quantities (provided they exist). This is the general rule (strategy) for defining physical quantities in Special Relativity and it is justified by the facts (a) that the physical quantities we understand and manipulate are those of the Newtonian world and (b) that Special Relativity must be understood physically in the Newtonian world. The difference introduced by Special Relativity is in the transformation of these quantities from LCF to LCF.


In order to do that we note that the four-velocity $u^i$ of the charge $q$ in $\Sigma^+$ has components $[u^i] = \begin{pmatrix}c\\ \mathbf{0}\end{pmatrix}_{\Sigma^+}$ and similarly the electromagnetic field tensor in $\Sigma^+$:

\[
[F_{ij}] = \begin{pmatrix}
0 & -E^+_x/c & -E^+_y/c & -E^+_z/c\\
E^+_x/c & 0 & B^+_z & -B^+_y\\
E^+_y/c & -B^+_z & 0 & B^+_x\\
E^+_z/c & B^+_y & -B^+_x & 0
\end{pmatrix}_{\Sigma^+}.
\]

Then using the definition of the four-force in $\Sigma^+$ we have (in $\Sigma^+$!) the relation:

\[
F_i = qF_{ij}u^j. \tag{13.65}
\]

But $j^i = qu^i$ is the four-current due to the charge. Therefore the four-force acting on the charge $q$ (in $\Sigma^+$!) is written covariantly as:

\[
F_i = F_{ij}j^j. \tag{13.66}
\]

This relation has been proved in $\Sigma^+$ but because it is covariant it is valid in any other LCF, therefore it defines the four-force on the charge due to the electromagnetic field.²⁴

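In order to compute the force on a charge moving with velocity $\mathbf{u}$ in an electromagnetic field $\mathbf{E}/c, \mathbf{B}$ in a LCF $\Sigma$ we work as follows. In $\Sigma$ the electromagnetic field tensor is:

\[
[F_{ij}] = \begin{pmatrix}
0 & -E_x/c & -E_y/c & -E_z/c\\
E_x/c & 0 & B_z & -B_y\\
E_y/c & -B_z & 0 & B_x\\
E_z/c & B_y & -B_x & 0
\end{pmatrix}_\Sigma
\]

and let the four-velocity of the charge be $u^i = \begin{pmatrix}\gamma_u c\\ \gamma_u\mathbf{u}\end{pmatrix}_\Sigma$. We know that the four-force $F^i$ due to a 3-force $\mathbf{f}$ in $\Sigma$ is given by the expression $F^i = \begin{pmatrix}\gamma_u\dot{\mathcal{E}}/c\\ \gamma_u\mathbf{f}\end{pmatrix}_\Sigma$, where $\dot{\mathcal{E}}$ is the rate of change of the energy of the charge in $\Sigma$ under the action of the force $\mathbf{f}$ (in $\Sigma$!). Replacing this in the lhs of equation (13.65) we find:

\[
\begin{pmatrix}-\gamma_u\dot{\mathcal{E}}/c\\ \gamma_u\mathbf{f}\end{pmatrix}_\Sigma
= q\begin{pmatrix}
0 & -E_x/c & -E_y/c & -E_z/c\\
E_x/c & 0 & B_z & -B_y\\
E_y/c & -B_z & 0 & B_x\\
E_z/c & B_y & -B_x & 0
\end{pmatrix}_\Sigma
\begin{pmatrix}\gamma_u c\\ \gamma_u\mathbf{u}\end{pmatrix}_\Sigma
= q\gamma_u\begin{pmatrix}-\frac{1}{c}\mathbf{E}\cdot\mathbf{u}\\ \mathbf{E} + \mathbf{u}\times\mathbf{B}\end{pmatrix}_\Sigma .
\]

²⁴ In case we have a charge density $\rho$ instead of a single charge, then the four-vector $F^i$ is the density of four-force and $j^i$ is the density of the four-current of the charge, $j^i = \begin{pmatrix}\rho c\\ \mathbf{j}\end{pmatrix}_\Sigma$.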

From this result follow two important conclusions:

• The magnetic field does not change the energy of the charge (does not produce work). Work is produced only by the electric field $\mathbf{E}$ and it is given by the relation:

\[
\dot{\mathcal{E}} = q\,\mathbf{E}\cdot\mathbf{u}. \tag{13.67}
\]

• The 3-force on the charge $q$ in $\Sigma$ due to the electromagnetic field $\mathbf{E}/c, \mathbf{B}$ (in $\Sigma$!) is given by the formula:

\[
\mathbf{f} = q(\mathbf{E} + \mathbf{u}\times\mathbf{B}). \tag{13.68}
\]

This force is known as the Lorentz force, after Lorentz, who first gave this formula empirically prior to the invention of Special Relativity. We note that this equation does not contain the factor $\gamma$, a fact which sometimes obscures its relativistic origin. However as we have shown this is a purely relativistic formula.
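The passage from the covariant expression (13.65) to the Lorentz force (13.68) can be reproduced numerically. The Python sketch below contracts the matrix $[F_{ij}]$ with the four-velocity and compares the result with $q(\mathbf{E} + \mathbf{u}\times\mathbf{B})$ and $q\,\mathbf{E}\cdot\mathbf{u}$. The function name `lorentz_four_force`, the units with $c = 1$ and the sample values are assumptions of this illustration.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def lorentz_four_force(q, E, B, u, c=1.0):
    """Covariant components F_i = q F_ij u^j of (13.65), and the factor gamma."""
    g = 1.0 / math.sqrt(1.0 - sum(x * x for x in u) / c**2)
    ex, ey, ez = (x / c for x in E)
    bx, by, bz = B
    F = [[0.0, -ex, -ey, -ez],
         [ex, 0.0, bz, -by],
         [ey, -bz, 0.0, bx],
         [ez, by, -bx, 0.0]]
    u4 = [g * c, g * u[0], g * u[1], g * u[2]]  # four-velocity u^i
    return [q * sum(F[i][j] * u4[j] for j in range(4)) for i in range(4)], g

c = 1.0
q, E, B, u = 2.0, (1.0, 2.0, 3.0), (0.5, -1.0, 2.0), (0.3, -0.2, 0.4)
F4, g = lorentz_four_force(q, E, B, u, c)

three_force = [F4[i] / g for i in (1, 2, 3)]   # spatial part divided by gamma
power = -c * F4[0] / g                          # rate of change of the energy

uxB = cross(u, B)
expected_force = [q * (E[k] + uxB[k]) for k in range(3)]
expected_power = q * sum(E[k] * u[k] for k in range(3))
```

The minus sign in `power` comes from lowering the index of the zeroth component, $F_0 = -F^0 = -\gamma\dot{\mathcal{E}}/c$.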

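The equation of motion of a relativistic mass point, $P$ say, under the action of a four-force $F^i$ is given by the relativistic generalization of the Second Law of Newton:

\[
F^i = \frac{dp^i}{d\tau} \tag{13.69}
\]

where $p^i$ is the four-momentum of the mass point and $\tau$ is its proper time. In a LCF $\Sigma$ we have $F^i = \begin{pmatrix}\gamma\dot{\mathcal{E}}/c\\ \gamma\mathbf{f}\end{pmatrix}_\Sigma$, $p^i = \begin{pmatrix}\mathcal{E}/c\\ \mathbf{p}\end{pmatrix}_\Sigma$ and this equation gives:

\[
\begin{pmatrix}\gamma\dot{\mathcal{E}}/c\\ \gamma\mathbf{f}\end{pmatrix}_\Sigma = \gamma\frac{d}{dt}\begin{pmatrix}\mathcal{E}/c\\ \mathbf{p}\end{pmatrix}_\Sigma
\]

where $t$ is time in $\Sigma$ and $\gamma = \frac{dt}{d\tau}$. The zeroth coordinate gives:

\[
\dot{\mathcal{E}} = \mathbf{f}\cdot\mathbf{v} \tag{13.70}
\]

and the space coordinate gives the equation of motion:

\[
\mathbf{f} = \frac{d\mathbf{p}}{dt}. \tag{13.71}
\]

The first relation expresses the conservation of kinetic energy, that is, in a motion in which the 3-force is always normal to the velocity, the energy (and consequently the speed) is a constant of motion. The second equation is the equation of motion and constitutes the generalization of Newton’s Second Law in Special Relativity.

In accordance with the above the equation of motion of a charge $q$ with velocity $\mathbf{v}$ in a LCF $\Sigma$ in which the electromagnetic field is given by the vectors $(\frac{1}{c}\mathbf{E}, \mathbf{B})$ is:

\[
\frac{d\mathbf{p}}{dt} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B}). \tag{13.72}
\]

The equation of motion (13.72) can be written using the results of Exercise 11.2.1:

\[
q(\mathbf{E} + \mathbf{v}\times\mathbf{B}) = m\dot{\gamma}\mathbf{v} + m\gamma\mathbf{a} = m\gamma\left(\gamma^2\mathbf{a}_\parallel + \mathbf{a}_\perp\right). \tag{13.73}
\]

Another equivalent form of this equation is in terms of the proper time of the charge. In this case we use the relation $\gamma = \frac{dt}{d\tau}$ and find:

\[
\frac{d\mathbf{p}}{d\tau} = q\left(\frac{dt}{d\tau}\mathbf{E} + \frac{d\mathbf{r}}{d\tau}\times\mathbf{B}\right). \tag{13.74}
\]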
T T 


13.9 Motion of a Charge in a Homogeneous Electromagnetic Field

In this section we study the motion of a charge $q$ of mass $m$ in a LCF $\Sigma$, in which there exists a homogeneous electromagnetic field with (constant) $\frac{1}{c}\mathbf{E}, \mathbf{B}$. The equation of motion of the charge is equation (13.72) or equation (13.74), whose solution gives the orbit of the charge in $\Sigma$. In general the solution of the equations of motion is difficult and one looks for simplifications and shortcuts which could make the solution possible. This is indeed the case and will be shown below.

As we have shown in Sect. 13.7 for every electromagnetic field there are LCF in which the electric and the magnetic vectors of the field have one of the following forms, depending on the values of the invariants of the field.

a. Only electric field ($X > 0$, $Y = 0$)

b. Only magnetic field ($X < 0$, $Y = 0$)

c. Electric and magnetic field perpendicular and of equal strength ($X = 0$, $Y = 0$)

d. Electric and magnetic field parallel and with different strength ($X \neq 0$, $Y \neq 0$)




This means that it is enough to solve the equations of motion of the charge in one 
of these cases in order to cover all possible cases concerning the motion of a charge 
in a homogeneous electromagnetic field. 

This indicates the following algorithm for studying the motion of a charge in an arbitrary homogeneous electromagnetic field:

• Calculate the invariants.

• Determine the type of the electromagnetic field, determine the appropriate LCF and calculate the electromagnetic field in that LCF.

• Write and solve the equations of motion in that LCF.

• Transfer the solution by means of the appropriate Lorentz transformation to the original LCF.

In the following we solve the equations of motion for each of the special cases (a)-(d) above.


13.9.1 The Case of a Homogeneous Electric Field 


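In this case we use the equation of motion (13.72):

\[
\frac{d\mathbf{p}}{dt} = q\mathbf{E} \tag{13.75}
\]

and assume the initial conditions $\mathbf{r}_0, \mathbf{p}_0$. Obviously the motion develops in the plane spanned by the vectors $\mathbf{E}, \mathbf{p}_0$.

The first integration of (13.75) gives:

\[
\mathbf{p} = q\mathbf{E}t + \mathbf{p}_0 .
\]

We decompose the 3-momentum $\mathbf{p}_0$ parallel and perpendicular to the direction of the electric field $\mathbf{E}$ and have:

\[
\mathbf{p} = (qEt + p_{0\parallel})\frac{\mathbf{E}}{E} + \mathbf{p}_{0\perp}.
\]

The total energy $\mathcal{E}$ of the charge is:

\[
\mathcal{E} = \sqrt{p^2c^2 + m^2c^4} = \sqrt{(q\mathbf{E}t + \mathbf{p}_0)^2c^2 + m^2c^4} = c\sqrt{A + p_{0\perp}^2}
\]

where $A = (qEt + p_{0\parallel})^2 + m^2c^2$. The 3-velocity:

\[
\mathbf{v} = \frac{c^2\mathbf{p}}{\mathcal{E}} = c\,\frac{q\mathbf{E}t + \mathbf{p}_0}{\sqrt{A + p_{0\perp}^2}}
\]

and the position vector:

\[
\mathbf{r} = \int\mathbf{v}\,dt = c\,\frac{\mathbf{E}}{E}\int\frac{(qEt + p_{0\parallel})\,dt}{\sqrt{A + p_{0\perp}^2}} + c\,\hat{\mathbf{e}}_\perp\int\frac{p_{0\perp}\,dt}{\sqrt{A + p_{0\perp}^2}},
\]

where $\hat{\mathbf{e}}_\perp$ is the unit normal to $\mathbf{E}$ in the plane defined by the vectors $\mathbf{E}, \mathbf{p}_0$. The first integral gives:

\[
\int\frac{(qEt + p_{0\parallel})\,dt}{\sqrt{A + p_{0\perp}^2}} = \frac{1}{2qE}\int\frac{dA}{\sqrt{A + p_{0\perp}^2}} = \frac{1}{qE}\sqrt{A + p_{0\perp}^2}
\]

and the second:

\[
\int\frac{p_{0\perp}\,dt}{\sqrt{A + p_{0\perp}^2}} = \frac{p_{0\perp}}{qE}\ln\left[qEt + p_{0\parallel} + \sqrt{(qEt + p_{0\parallel})^2 + p_{0\perp}^2 + m^2c^2}\right].
\]

Replacing we find:

\[
\mathbf{r}(t) = \frac{c\,\mathbf{E}}{qE^2}\sqrt{(qEt + p_{0\parallel})^2 + m^2c^2 + p_{0\perp}^2} + \frac{c\,p_{0\perp}}{qE}\ln\left[qEt + p_{0\parallel} + \sqrt{(qEt + p_{0\parallel})^2 + p_{0\perp}^2 + m^2c^2}\right]\hat{\mathbf{e}}_\perp + \mathbf{C}. \tag{13.76}
\]

Taking into account the initial conditions we compute for the constant $\mathbf{C}$:

\[
\mathbf{C} = \mathbf{r}_0 - \frac{c}{qE}\left(\frac{\mathcal{E}_0}{c}\frac{\mathbf{E}}{E} + p_{0\perp}\ln\left[\frac{\mathcal{E}_0}{c} + p_{0\parallel}\right]\hat{\mathbf{e}}_\perp\right)
\]

where $\mathcal{E}_0 = \sqrt{p_0^2c^2 + m^2c^4}$ is the initial energy of the charge.

2nd solution
Without restricting generality we consider the $x$-axis along the direction of the electric field and the $y$-axis in the plane spanned by the vectors $\mathbf{E}, \mathbf{p}_0$. Then $\mathbf{E} = E\mathbf{i}$ and the initial conditions become:

\[
x(0) = y(0) = z(0) = 0
\]

\[
\frac{d\,ct}{d\tau} = \frac{\mathcal{E}_0}{mc},\qquad \frac{dx}{d\tau} = \frac{p_{0x}}{m},\qquad \frac{dy}{d\tau} = \frac{p_{0y}}{m},\qquad \frac{dz}{d\tau} = 0
\]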


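where $\mathcal{E}_0 = \sqrt{p_0^2c^2 + m^2c^4}$ is the initial energy of the charge and $\tau$ its proper time. We consider the equations of motion in the form (13.69) and write:

\[
m\frac{d^2}{d\tau^2}\begin{pmatrix}ct\\ \mathbf{r}\end{pmatrix} = \begin{pmatrix}\dfrac{\gamma}{c}\dfrac{d\mathcal{E}}{dt}\\ \gamma q\mathbf{E}\end{pmatrix}.
\]

But:

\[
\gamma\frac{d\mathcal{E}}{dt} = \gamma\,\mathbf{f}\cdot\mathbf{v} = \gamma q\mathbf{E}\cdot\frac{d\mathbf{r}}{dt} = qE\frac{dx}{d\tau}
\]

hence the equations of motion in the LCF we are working become:

\[
\begin{aligned}
\frac{d^2ct}{d\tau^2} &= \frac{qE}{mc}\frac{dx}{d\tau}\\
\frac{d^2x}{d\tau^2} &= \frac{qE}{mc}\frac{d\,ct}{d\tau}\\
\frac{d^2y}{d\tau^2} &= 0\\
\frac{d^2z}{d\tau^2} &= 0.
\end{aligned} \tag{13.77}
\]

The solution of these equations, with the initial conditions we have considered, is:

\[
\begin{aligned}
x(\tau) &= \frac{\mathcal{E}_0}{qE}\left(\cosh\frac{qE\tau}{mc} - 1\right) + \frac{cp_{0x}}{qE}\sinh\frac{qE\tau}{mc}\\
y(\tau) &= \frac{p_{0y}\tau}{m}\\
z(\tau) &= 0\\
ct(\tau) &= \frac{\mathcal{E}_0}{qE}\sinh\frac{qE\tau}{mc} + \frac{cp_{0x}}{qE}\left(\cosh\frac{qE\tau}{mc} - 1\right).
\end{aligned} \tag{13.78}
\]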

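In order to calculate the motion of the particle in $\Sigma$ we express the proper time $\tau$ in terms of the time $t$ of $\Sigma$. This is done as follows. The $x$-component of the 3-momentum is:

\[
p_x = mu^x = m\frac{dx}{d\tau} = \frac{\mathcal{E}_0}{c}\sinh\frac{qE\tau}{mc} + p_{0x}\cosh\frac{qE\tau}{mc}.
\]

Similarly the energy (= zeroth coordinate of the four-momentum) is:

\[
\frac{\mathcal{E}}{c} = mu^0 = \frac{1}{c}\left(\mathcal{E}_0\cosh\frac{qE\tau}{mc} + cp_{0x}\sinh\frac{qE\tau}{mc}\right).
\]

We add:

\[
\mathcal{E} + cp_x = (\mathcal{E}_0 + cp_{0x})\exp\frac{qE\tau}{mc} \ \Rightarrow\ \tau = \frac{mc}{qE}\ln\frac{\mathcal{E} + cp_x}{\mathcal{E}_0 + cp_{0x}}. \tag{13.79}
\]

In detail this relation is written as follows:

\[
\tau = \frac{mc}{qE}\ln\left[\frac{p_{0x} + qEt + \sqrt{(p_{0x} + qEt)^2 + m^2c^2 + p_{0y}^2}}{p_{0x} + \mathcal{E}_0/c}\right]. \tag{13.80}
\]

Replacing $\tau$ in (13.78) the motion of the particle in $\Sigma$ is found to be:

\[
\begin{aligned}
x(t) &= \frac{c}{qE}\left[\sqrt{(p_{0x} + qEt)^2 + p_{0y}^2 + m^2c^2} - \frac{\mathcal{E}_0}{c}\right]\\
y(t) &= \frac{p_{0y}c}{qE}\ln\left[\frac{p_{0x} + qEt + \sqrt{(p_{0x} + qEt)^2 + m^2c^2 + p_{0y}^2}}{p_{0x} + \mathcal{E}_0/c}\right]\\
z(t) &= 0.
\end{aligned} \tag{13.81}
\]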
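The parametric solution (13.78) in the proper time and the solution (13.81) in the time $t$ of $\Sigma$ describe the same worldline, which gives a simple numerical consistency check. The Python sketch below evaluates (13.78) at some proper time, takes the resulting $t$ and compares the position with (13.81). The function names, the units with $c = m = q = E = 1$ and the initial momenta are assumptions of this illustration.

```python
import math

def worldline(tau, q, E, m, p0x, p0y, c=1.0):
    """(ct, x, y) at proper time tau from the parametric solution (13.78)."""
    e0 = math.sqrt((p0x**2 + p0y**2) * c**2 + m**2 * c**4)  # initial energy
    k = q * E / (m * c)
    ct = e0 / (q * E) * math.sinh(k * tau) + c * p0x / (q * E) * (math.cosh(k * tau) - 1.0)
    x = e0 / (q * E) * (math.cosh(k * tau) - 1.0) + c * p0x / (q * E) * math.sinh(k * tau)
    y = p0y * tau / m
    return ct, x, y

def position(t, q, E, m, p0x, p0y, c=1.0):
    """(x, y) at coordinate time t from (13.81)."""
    e0 = math.sqrt((p0x**2 + p0y**2) * c**2 + m**2 * c**4)
    root = math.sqrt((p0x + q * E * t)**2 + p0y**2 + m**2 * c**2)
    x = c / (q * E) * (root - e0 / c)
    y = p0y * c / (q * E) * math.log((p0x + q * E * t + root) / (p0x + e0 / c))
    return x, y

params = dict(q=1.0, E=1.0, m=1.0, p0x=0.3, p0y=0.4)
ct, x_tau, y_tau = worldline(0.7, **params)
x_t, y_t = position(ct / 1.0, **params)   # c = 1, so t = ct
```

Agreement of the two pairs of coordinates confirms that the elimination of $\tau$ through (13.80) has been carried out consistently.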


It is of interest to examine the Newtonian and the relativistic limit of this result.

In the Newtonian limit $p_0 \ll mc$ and $t \ll \frac{mc}{qE}$, from which it follows easily that the motion of the charge in $\Sigma$ is given by the classical solution:

\[
\begin{aligned}
x(t) &= \frac{p_{0x}}{m}t + \frac{qE}{2m}t^2\\
y(t) &= \frac{p_{0y}t}{m}\\
z(t) &= 0.
\end{aligned}
\]

In the relativistic limit for large times $t \gg \frac{mc}{qE}$ the speed of the charge approaches the value $c$ even if originally small. In this limit the motion of the charge is:

\[
\begin{aligned}
x(t) &= ct - \frac{mc^2}{qE}\\
y(t) &= \frac{cp_{0y}}{qE}\ln\frac{2|q|Et}{mc}\\
z(t) &= 0.
\end{aligned} \tag{13.82}
\]


Fig. 13.2 Motion of a charge in a homogeneous electric field

In Fig. 13.2 the distance $x(t)$ is shown for the initial conditions $p_{0x} = p_{0y} = 0$, and in addition the Newtonian ($t \ll \frac{mc}{qE}$) and the relativistic ($t \gg \frac{mc}{qE}$) limits of motion are indicated.

In order to compute the equation of the trajectory of the charge in $\Sigma$ we eliminate the proper time $\tau$ from (13.78) and find:

\[
x = \frac{\mathcal{E}_0}{qE}\left(\cosh\frac{qEy}{cp_{0y}} - 1\right) + \frac{cp_{0x}}{qE}\sinh\frac{qEy}{cp_{0y}}. \tag{13.83}
\]

In the Newtonian limit $\mathcal{E}_0 = mc^2$, $p_0 \ll mc$ and $\frac{|q|Ey}{cp_{0y}} \ll 1$, because the momentum of the charge must be small compared to $mc$. Using these results one obtains the well known Newtonian result:

\[
x = \frac{mqE}{2p_{0y}^2}y^2 + \frac{p_{0x}}{p_{0y}}y.
\]


13.9.2 The Case of a Homogeneous Magnetic Field 


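The equation of motion of a charge in a homogeneous magnetic field is:

\[
\frac{d\mathbf{p}}{dt} = q\mathbf{v}\times\mathbf{B}. \tag{13.84}
\]

Multiplying with $\mathbf{p}$ we find:

\[
\mathbf{p}\cdot\frac{d\mathbf{p}}{dt} = 0 \ \Rightarrow\ \mathbf{p}^2 = \text{constant} \ \Rightarrow\ \mathcal{E} = \sqrt{p^2c^2 + m^2c^4} = \text{constant}.
\]

The speed $v = \frac{pc^2}{\mathcal{E}} = $ constant. Using the constant energy we rewrite the equation of motion as follows:

\[
\frac{\mathcal{E}}{c^2}\dot{\mathbf{v}} = q\mathbf{v}\times\mathbf{B}
\]

and we study the motion parallel and perpendicular to the magnetic field. Integrating (13.84) parallel to the magnetic field we find:

\[
\mathbf{r}_\parallel = \mathbf{v}_\parallel(0)t + \mathbf{r}_\parallel(0).
\]

The speed²⁵:

\[
v^2 = v_\parallel^2 + v_\perp^2 \ \Rightarrow\ |\mathbf{v}_\perp(t)| = \text{constant} = |\mathbf{v}_\perp(0)|.
\]

This result is an additional integral of motion and implies that, if the vector $\mathbf{v}_\perp$ in the plane normal to $\mathbf{B}$ is represented by the complex number $v_x + iv_y$:

\[
\mathbf{v}_\perp(t) = \mathbf{v}_\perp(0)e^{-i\omega t}
\]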

²⁵ It is possible to integrate the equations of motion directly and not use the first integrals. This is done as follows. Without restricting the generality we consider the $z$ axis along the direction of the magnetic field ($\mathbf{B} = B\mathbf{k}$) in which case the equations of motion in the direction perpendicular to the magnetic field give:

\[
m\gamma\dot{v}_x = qBv_y \ \Rightarrow\ \dot{v}_x = \omega v_y
\]

\[
m\gamma\dot{v}_y = -qBv_x \ \Rightarrow\ \dot{v}_y = -\omega v_x
\]

where $\omega = \frac{qB}{m\gamma}$. To solve this system of simultaneous equations we multiply the second with $i$ and add. It follows:

\[
(\dot{v}_x + i\dot{v}_y) = -i\omega(v_x + iv_y) \ \Rightarrow\ w(t) = w(0)e^{-i\omega t} \quad\text{where } w(t) = v_x + iv_y .
\]

Equating the real and the imaginary parts we find:

\[
v_x(t) = v_x(0)\cos\omega t + v_y(0)\sin\omega t
\]

\[
v_y(t) = -v_x(0)\sin\omega t + v_y(0)\cos\omega t .
\]

Let us assume for simplicity the initial conditions $v_x(0) = v$, $v_y(0) = 0$. Then $v_x(t) = v\cos\omega t$, $v_y(t) = -v\sin\omega t$. Integrating these last relations we find for the motion perpendicular to the direction of the magnetic field:

\[
x(t) = x_0 + \frac{v}{\omega}\sin\omega t
\]

\[
y(t) = y_0 + \frac{v}{\omega}\cos\omega t .
\]


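where $\omega$ is a real parameter. In order to determine $\omega$ we differentiate $\mathbf{v}_\perp(t)$ and use the equation of motion. We find:

\[
\dot{\mathbf{v}}_\perp(t) = -i\omega\,\mathbf{v}_\perp(0)e^{-i\omega t} = -i\omega\,\mathbf{v}_\perp(t).
\]

In the complex representation $v_x + iv_y$ with $\mathbf{B}$ along the $z$ axis, the product $\mathbf{v}_\perp\times\mathbf{B}$ corresponds to $-iB\,\mathbf{v}_\perp$, therefore:

\[
\frac{\mathcal{E}}{c^2}\dot{\mathbf{v}}_\perp(t) = q\,\mathbf{v}_\perp(t)\times\mathbf{B} \ \Rightarrow\ \omega = \frac{qc^2B}{\mathcal{E}} = \frac{qB}{m\gamma}.
\]

The parameter $\omega$ we call the cyclotron frequency and it is known from Newtonian Physics. Integrating once more we find the motion perpendicular to the magnetic field:

\[
\mathbf{r}_\perp(t) = \frac{i\,\mathbf{v}_\perp(0)}{\omega}\left(e^{-i\omega t} - 1\right) + \mathbf{r}_\perp(0).
\]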


From the solution of the equations of motion we conclude that the motion of a 
charge in a constant magnetic field is a combination of two motions: A translational 
motion with constant speed parallel to the magnetic field and a uniform circular 
motion with frequency w in the plane normal to the magnetic field. 

Finally in the Newtonian limit ($\gamma \to 1$) $\omega = \frac{qB}{m}$.
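The statement that the perpendicular motion is a uniform rotation with frequency $\omega = qB/(m\gamma)$ can be checked numerically. The following is only an illustrative sketch, not part of the original text: it integrates $\dot{\mathbf p} = q\,\mathbf v\times\mathbf B$ with a hand-written Runge-Kutta step and checks that the momentum returns to its initial value after one period $2\pi/\omega$. The function names, the unit choice $q = m = c = 1$ and the step count are assumptions made for the example.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def energy(p, m, c):
    """Relativistic energy sqrt(p^2 c^2 + m^2 c^4)."""
    return math.sqrt(sum(x*x for x in p)*c*c + (m*c*c)**2)

def force(p, q, m, c, B):
    """dp/dt = q v x B, with v = p c^2 / energy."""
    e = energy(p, m, c)
    v = tuple(x*c*c/e for x in p)
    return tuple(q*x for x in cross(v, B))

def rk4_step(p, dt, q, m, c, B):
    """One classical fourth-order Runge-Kutta step for the momentum."""
    k1 = force(p, q, m, c, B)
    k2 = force(tuple(a + 0.5*dt*b for a, b in zip(p, k1)), q, m, c, B)
    k3 = force(tuple(a + 0.5*dt*b for a, b in zip(p, k2)), q, m, c, B)
    k4 = force(tuple(a + dt*b for a, b in zip(p, k3)), q, m, c, B)
    return tuple(a + dt*(w + 2*x + 2*y + z)/6
                 for a, w, x, y, z in zip(p, k1, k2, k3, k4))

def momentum_after_one_period(p0, q=1.0, m=1.0, c=1.0, Bz=2.0, steps=2000):
    """Integrate over T = 2*pi/omega, omega = qB/(m*gamma); return (p(T), omega)."""
    B = (0.0, 0.0, Bz)
    gamma = energy(p0, m, c)/(m*c*c)
    omega = q*Bz/(m*gamma)
    dt = (2*math.pi/omega)/steps
    p = p0
    for _ in range(steps):
        p = rk4_step(p, dt, q, m, c, B)
    return p, omega
```

For example, `momentum_after_one_period((0.6, 0.0, 0.3))` should return a momentum equal to the initial one up to the integration error, while the component along the field never changes.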


13.9.3 The Case of Two Homogeneous Fields of Equal 
Strength and Perpendicular Directions 


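This case concerns the motion of a charge in an electromagnetic field in which the electric and the magnetic fields are related as follows: $E = B$ and $\mathbf{E}\cdot\mathbf{B} = 0$. Without restricting generality we consider $\mathbf{E} = E\mathbf{j}$ and $\mathbf{B} = E\mathbf{k}$. Then the equation of motion

$$\dot{\mathbf{p}} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B})$$

becomes:

$$\dot p_x = \frac{q}{c}\,v_yE$$

$$\dot p_y = qE\left(1 - \frac{v_x}{c}\right)$$

$$\dot p_z = 0.$$

The last equation gives:

$$p_z = p_{0z} = \text{constant}.$$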




In order to solve the remaining two equations we look for first integrals. We note that the energy $\mathcal{E}$ is written as:

$$\mathcal{E}^2 - p_x^2c^2 = p_y^2c^2 + C_1^2 \tag{13.85}$$

where $C_1^2 = p_z^2c^2 + m^2c^4$. But:

$$\dot{\mathcal{E}} = \mathbf{f}\cdot\mathbf{v} = qEv_y = \dot p_xc \;\Rightarrow\; \mathcal{E} - p_xc = a = \text{constant}$$

where we have used the equation of motion for $p_x$. Combining this result with (13.85) we find:

$$\mathcal{E} + p_xc = \frac{p_y^2c^2 + C_1^2}{a}. \tag{13.86}$$

From the last two relations we find by adding and subtracting:

$$\mathcal{E} = \frac{a}{2} + \frac{p_y^2c^2 + C_1^2}{2a} \tag{13.87}$$

$$p_x = -\frac{a}{2c} + \frac{p_y^2c^2 + C_1^2}{2ac}. \tag{13.88}$$

The second equation of motion gives for the quantity $p_y$:

$$\dot p_y = qE\left(1 - \frac{p_xc}{\mathcal{E}}\right) = \frac{qE}{\mathcal{E}}\,(\mathcal{E} - p_xc) = \frac{qEa}{\mathcal{E}}.$$

Replacing $\mathcal{E}$ from (13.87) we have:

$$\left(\frac{a^2 + C_1^2}{2a} + \frac{c^2}{2a}\,p_y^2\right)dp_y = qEa\,dt$$

and finally:

$$\frac{c^2}{3a^2}\,p_y^3 + \left(1 + \frac{C_1^2}{a^2}\right)p_y = 2qEt + \text{constant}. \tag{13.89}$$


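In order to compute the motion (that is the functions $x(t)$, $y(t)$, $z(t)$) we use (13.89) and replace time with the variable $p_y$. This is done as follows.

We consider initially the coordinate $x(t)$ and note that the velocity $v_x$ can be expressed in two ways. Either as:

$$v_x = \frac{p_xc^2}{\mathcal{E}}$$

or as:

$$v_x = \frac{dx}{dt} = \frac{dx}{dp_y}\frac{dp_y}{dt} = \frac{dx}{dp_y}\frac{qaE}{\mathcal{E}}.$$

Equating these two expressions we find the differential equation:

$$\frac{dx}{dp_y} = \frac{p_xc^2}{qaE} = \frac{c^2}{qaE}\left(-\frac{a}{2c} + \frac{p_y^2c^2 + C_1^2}{2ac}\right) = \frac{c}{2qE}\left(-1 + \frac{C_1^2}{a^2} + \frac{c^2}{a^2}\,p_y^2\right)$$

whose solution is:

$$x(p_y) = \frac{c}{2qE}\left[\left(-1 + \frac{C_1^2}{a^2}\right)p_y + \frac{c^2}{3a^2}\,p_y^3\right] + x(0). \tag{13.90}$$

Similarly for the $y(t)$ coordinate we compute:

$$v_y = \frac{p_yc^2}{\mathcal{E}}, \qquad v_y = \frac{dy}{dt} = \frac{dy}{dp_y}\frac{dp_y}{dt} = \frac{dy}{dp_y}\frac{qaE}{\mathcal{E}}$$

from which follows:

$$\frac{dy}{dp_y} = \frac{c^2}{qaE}\,p_y \;\Rightarrow\; y(p_y) = \frac{c^2}{2qaE}\,p_y^2 + y(0). \tag{13.91}$$

Finally for the $z(t)$ coordinate we have:

$$\frac{dz}{dp_y} = \frac{p_{0z}c^2}{qaE} \;\Rightarrow\; z(p_y) = \frac{p_{0z}c^2}{qaE}\,p_y + z(0). \tag{13.92}$$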
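The two first integrals used in this subsection, $a = \mathcal{E} - p_xc$ and the combination in (13.89), can be checked by direct numerical integration of the component equations of motion. The sketch below is only an illustration under stated assumptions: it takes the component equations exactly as written above ($\dot p_x = qv_yE/c$, $\dot p_y = qE(1 - v_x/c)$, $\dot p_z = 0$), uses a hand-written Runge-Kutta step, and the function names, units ($q = E = m = c = 1$) and initial momentum are hypothetical choices.

```python
import math

def derivs(p, q, E, m, c):
    """Right-hand sides of the component equations of motion of this subsection."""
    px, py, pz = p
    en = math.sqrt((px*px + py*py + pz*pz)*c*c + (m*c*c)**2)
    vx, vy = px*c*c/en, py*c*c/en
    return (q*vy*E/c, q*E*(1.0 - vx/c), 0.0)

def rk4_step(p, dt, q, E, m, c):
    """One classical fourth-order Runge-Kutta step."""
    k1 = derivs(p, q, E, m, c)
    k2 = derivs(tuple(a + 0.5*dt*b for a, b in zip(p, k1)), q, E, m, c)
    k3 = derivs(tuple(a + 0.5*dt*b for a, b in zip(p, k2)), q, E, m, c)
    k4 = derivs(tuple(a + dt*b for a, b in zip(p, k3)), q, E, m, c)
    return tuple(a + dt*(w + 2*x + 2*y + z)/6
                 for a, w, x, y, z in zip(p, k1, k2, k3, k4))

def first_integrals(p, t, q, E, m, c):
    """Return a = energy - p_x c and the left side of (13.89) minus 2qEt."""
    px, py, pz = p
    en = math.sqrt((px*px + py*py + pz*pz)*c*c + (m*c*c)**2)
    a = en - px*c
    c1_sq = pz*pz*c*c + (m*c*c)**2
    g = c*c*py**3/(3*a*a) + (1.0 + c1_sq/(a*a))*py - 2.0*q*E*t
    return a, g

def drift_of_integrals(p0=(0.2, 0.1, 0.3), q=1.0, E=1.0, m=1.0, c=1.0,
                       t_end=2.0, steps=2000):
    """Integrate up to t_end and return how much each first integral changed."""
    dt = t_end/steps
    p, t = p0, 0.0
    a0, g0 = first_integrals(p, t, q, E, m, c)
    for _ in range(steps):
        p = rk4_step(p, dt, q, E, m, c)
        t += dt
    a1, g1 = first_integrals(p, t, q, E, m, c)
    return abs(a1 - a0), abs(g1 - g0)
```

Both returned numbers should be at the level of the integration error, which is what "first integral" means in practice.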


13.9.4 The Case of Homogeneous and Parallel Fields E || B 


In this case the electric and the magnetic field are parallel and affect independently 
the motion of the charge, the electric field along the common direction of the 
fields and the magnetic field normal to this direction. Therefore the motion can 
be considered as a combination of two motions: one motion under the action of a 
homogeneous electric field and a second motion under the action of a homogeneous 
magnetic field. These two motions are related via the energy of the charge, which 
depends on its speed (contrary to the Newtonian case). 




Without restricting generality we assume the z axis along the common direction of the fields and write $\mathbf{E} = E\mathbf{k}$, $\mathbf{B} = B\mathbf{k}$. The equations of motion are:

$$\dot p_x = qv_yB, \qquad \dot p_y = -qv_xB, \qquad \dot p_z = qE. \tag{13.93}$$

Because $v_x = \frac{p_xc^2}{\mathcal{E}}$ and similarly for $v_y$, the equations of motion become:

$$\dot p_x = \frac{qBc^2}{\mathcal{E}}\,p_y, \qquad \dot p_y = -\frac{qBc^2}{\mathcal{E}}\,p_x, \qquad \dot p_z = qE. \tag{13.94}$$

The third equation concerns the motion under the action of a homogeneous electric field and has been studied in Sect. 13.9.1. The solution is (see (13.81)):

$$z(t) = \frac{1}{qE}\left(c\sqrt{(p_{0z} + qEt)^2 + m^2c^2 + p_{0\perp}^2} - \mathcal{E}_0\right) \tag{13.95}$$

where $\mathbf{p}_{0\perp} = p_{0x}\mathbf{i} + p_{0y}\mathbf{j}$ and $\mathcal{E}_0$ is the energy of the charge at the moment $t = 0$. We also have (for $p_{0z} = 0$):

$$p_z(t) = qEt.$$

Differentiating $\mathcal{E}^2 = p^2c^2 + m^2c^4$ wrt time and using the equations of motion we find $\mathcal{E}\dot{\mathcal{E}} = c^2p_z\dot p_z$, from which follows:

$$\mathcal{E} = \sqrt{p_z^2c^2 + \mathcal{E}_0^2} = \sqrt{(qEct)^2 + \mathcal{E}_0^2}. \tag{13.96}$$

Concerning the solution of the remaining two equations we cannot apply the previous study concerning the motion of a charge in a homogeneous magnetic field because now the energy of the charge is not constant. From the equations of motion we have:

$$p_x\dot p_x + p_y\dot p_y = 0 \;\Rightarrow\; p_x^2 + p_y^2 = \text{constant} = p_\perp^2 = p_{0\perp}^2 \tag{13.97}$$

therefore we have the first integral:

$$p_x + ip_y = p_\perp e^{-i\phi} \tag{13.98}$$

where $\phi$ is a real parameter. Differentiating (13.98) wrt $t$ and replacing $\dot p_x$, $\dot p_y$ from the equations of motion we find for the parameter $\phi$ the equation:

$$\dot\phi = \frac{qBc^2}{\mathcal{E}} = \frac{qBc^2}{\sqrt{(qEct)^2 + \mathcal{E}_0^2}}. \tag{13.99}$$



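Integrating wrt $t$ we find eventually (assuming $\phi(0) = 0$):

$$ct = \frac{\mathcal{E}_0}{qE}\sinh\frac{E\phi}{Bc}.$$

Using this equation we can change the variable $t$ with $\phi$, facilitating the integration of the equations of motion. Indeed we have:

$$p_x + ip_y = \frac{\mathcal{E}}{c^2}\,(v_x + iv_y) = \frac{\mathcal{E}}{c^2}\frac{d(x + iy)}{dt} = \frac{\mathcal{E}}{c^2}\frac{d\phi}{dt}\frac{d(x + iy)}{d\phi} = \frac{\mathcal{E}}{c^2}\frac{qBc^2}{\mathcal{E}}\frac{d(x + iy)}{d\phi} = qB\,\frac{d(x + iy)}{d\phi}.$$

Integrating wrt $\phi$:

$$x + iy = \frac{p_\perp}{qB}\int_0^\phi e^{-i\phi'}\,d\phi' + x(0) + iy(0) = i\,\frac{p_\perp}{qB}\left(e^{-i\phi} - 1\right) + x(0) + iy(0)$$

from which follows:

$$x(\phi) = \frac{p_\perp}{qB}\sin\phi + x(0), \qquad y(\phi) = \frac{p_\perp}{qB}\,(\cos\phi - 1) + y(0).$$

Concerning the $z(t)$ coordinate we find in a similar fashion:

$$qEt = \frac{\mathcal{E}}{c^2}\frac{dz}{dt} = \frac{\mathcal{E}}{c^2}\frac{d\phi}{dt}\frac{dz}{d\phi} = qB\,\frac{dz}{d\phi} \;\Rightarrow\; \frac{dz}{d\phi} = \frac{\mathcal{E}_0}{qBc}\sinh\frac{E\phi}{Bc}.$$

Integrating:

$$z(\phi) = \frac{\mathcal{E}_0}{qE}\left(\cosh\frac{E\phi}{Bc} - 1\right) + z(0).$$

The orbit of the charge is a helix with constant radius $\frac{p_\perp}{qB}$ and a pace which increases monotonically. The angular speed of rotation equals $\frac{d\phi}{dt} = \frac{qBc^2}{\mathcal{E}}$ and diminishes with time, whereas the speed of the translational motion along the common direction of the fields increases continually with limiting value $c$.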
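The relation between $t$ and $\phi$ can be verified numerically by integrating (13.99) and comparing with the closed form $\phi(t) = \frac{Bc}{E}\sinh^{-1}\frac{qEct}{\mathcal{E}_0}$, which is the inverse of the $\sinh$ relation above. The following sketch is an illustration only; the function names, the Simpson rule and the parameter values are assumptions made for the example, and the orbit formulas assume $x(0) = y(0) = z(0) = 0$.

```python
import math

def phi_closed(t, q, E, B, c, e0):
    """Closed form phi(t) = (Bc/E) asinh(q E c t / e0), with phi(0) = 0."""
    return (B*c/E)*math.asinh(q*E*c*t/e0)

def phi_rate(t, q, E, B, c, e0):
    """Right-hand side of (13.99): q B c^2 / sqrt((q E c t)^2 + e0^2)."""
    return q*B*c*c/math.sqrt((q*E*c*t)**2 + e0**2)

def phi_numeric(t_end, q, E, B, c, e0, n=2000):
    """Composite Simpson integration of (13.99) from 0 to t_end (n must be even)."""
    h = t_end/n
    total = phi_rate(0.0, q, E, B, c, e0) + phi_rate(t_end, q, E, B, c, e0)
    for i in range(1, n):
        total += (4 if i % 2 else 2)*phi_rate(i*h, q, E, B, c, e0)
    return total*h/3.0

def orbit_point(phi, p_perp, q, E, B, c, e0):
    """Point of the helix for x(0) = y(0) = z(0) = 0."""
    radius = p_perp/(q*B)
    x = radius*math.sin(phi)
    y = radius*(math.cos(phi) - 1.0)
    z = (e0/(q*E))*(math.cosh(E*phi/(B*c)) - 1.0)
    return x, y, z
```

With this parametrisation the projection on the x-y plane is a circle of radius $p_\perp/(qB)$ centred at $(0, -p_\perp/(qB))$, and $z$ agrees with (13.95) for $p_{0z} = 0$.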


13.10 The Relativistic Electric and Magnetic Fields 


In the previous sections we developed the theory of the electromagnetic field using 
the 3-vector notation. This formulation, although it is more tangible to the new 
comer in Special Relativity, lacks the consistency and the power of the four-vector 




formalism. Furthermore it cannot be used in General Relativity in which the four- 
formalism is a must. For these reasons in the present and the following sections 
we discuss the theory of the electromagnetic field using the four-formalism. In 
particular we consider the case of a homogeneous and isotropic material, the empty 
space being the extreme particular case. Although it is not necessary, we shall keep 
c in the formulae in order to make them applicable to numerical calculations. This 
means that the four-velocity of a comoving observer is $u^a = c\,\delta^a_0$ ($u_a = -c\,\delta^0_a$). All frames are assumed to be Lorentz orthonormal frames (i.e. LCF) so that the metric $\eta_{ab} = \mathrm{diag}(-1, 1, 1, 1)$.


13.10.1 The Levi Civita Tensor Density 


In the following we shall use the antisymmetric tensor $F_{ab}$, hence it is necessary to discuss concisely the basic tool in the manipulation of such tensors, that is the Levi Civita tensor density.

In Minkowski space the alternating tensor is defined as follows: 


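$$\eta^{abcd} = \eta^{[abcd]}, \qquad \eta^{0123} = (-g)^{-1/2} = -\eta_{0123} \tag{13.100}$$

where $g = \det(g_{ab})$. It has the properties:

$$\eta^{abcd}\eta_{afgh} = -3!\,\delta^{[b}_f\delta^c_g\delta^{d]}_h = -\left(\delta^b_f\delta^c_g\delta^d_h + \delta^b_g\delta^c_h\delta^d_f + \delta^b_h\delta^c_f\delta^d_g - \delta^b_f\delta^c_h\delta^d_g - \delta^b_g\delta^c_f\delta^d_h - \delta^b_h\delta^c_g\delta^d_f\right) \tag{13.101}$$

$$\eta^{abcd}\eta_{abgh} = -4\,\delta^{[c}_g\delta^{d]}_h = -2\left(\delta^c_g\delta^d_h - \delta^c_h\delta^d_g\right) \tag{13.102}$$

$$\eta^{abcd}\eta_{abch} = -3!\,\delta^d_h \tag{13.103}$$

$$\eta^{abcd}\eta_{abcd} = -4!. \tag{13.104}$$

In the Euclidian 3-d space the alternating tensor is defined as follows:

$$\eta_{\mu\nu\rho} = \eta_{[\mu\nu\rho]}, \qquad \eta_{123} = h^{1/2}$$

where $h = \det(h_{\mu\nu})$ and satisfies the properties:

$$\eta^{\mu\nu\rho}\eta_{\mu\sigma\tau} = 2\,\delta^{[\nu}_\sigma\delta^{\rho]}_\tau, \qquad \eta^{\mu\nu\rho}\eta_{\mu\nu\tau} = 2\,\delta^\rho_\tau. \tag{13.105}$$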


Usually in Euclidian space $\eta_{\mu\nu\rho}$ is written as $\varepsilon_{\mu\nu\rho}$, because the determinant of the Euclidian metric equals 1 and therefore the tensor density becomes a tensor. However we should not worry about that and we shall keep the notation $\eta_{\mu\nu\rho}$ for uniformity.
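The contraction identities of the alternating tensor are easy to confirm by brute force in a Lorentz orthonormal frame. The sketch below is illustrative only and rests on stated assumptions: $\eta^{0123} = +1$ (so $g = -1$), $\eta_{abcd} = -\eta^{abcd}$ because lowering all four indices with $\mathrm{diag}(-1,1,1,1)$ always contributes one factor $-1$, and the helper names are hypothetical.

```python
from itertools import permutations, product

def parity(perm):
    """+1 for an even permutation, -1 for an odd one (counted by inversions)."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

EPS4 = {p: parity(p) for p in permutations(range(4))}
EPS3 = {p: parity(p) for p in permutations(range(3))}

def eta_up(a, b, c, d):
    """eta^{abcd} with eta^{0123} = +1."""
    return EPS4.get((a, b, c, d), 0)

def eta_down(a, b, c, d):
    """eta_{abcd} = -eta^{abcd} for the metric diag(-1, 1, 1, 1)."""
    return -EPS4.get((a, b, c, d), 0)

def eps3(i, j, k):
    """Euclidean alternating symbol (index position is irrelevant here)."""
    return EPS3.get((i, j, k), 0)

def delta(i, j):
    return 1 if i == j else 0

def check_4d():
    """Return (full contraction, 3-index identity holds, 2-index identity holds)."""
    r = range(4)
    full = sum(eta_up(*idx)*eta_down(*idx) for idx in product(r, repeat=4))
    three = all(
        sum(eta_up(a, b, c, d)*eta_down(a, b, c, h)
            for a, b, c in product(r, repeat=3)) == -6*delta(d, h)
        for d, h in product(r, repeat=2))
    two = all(
        sum(eta_up(a, b, c, d)*eta_down(a, b, g, h)
            for a, b in product(r, repeat=2))
        == -2*(delta(c, g)*delta(d, h) - delta(c, h)*delta(d, g))
        for c, d, g, h in product(r, repeat=4))
    return full, three, two

def check_3d():
    """Check the two Euclidean contractions quoted in the text."""
    r = range(3)
    one = all(
        sum(eps3(m, n, p)*eps3(m, s, t) for m in r)
        == delta(n, s)*delta(p, t) - delta(n, t)*delta(p, s)
        for n, p, s, t in product(r, repeat=4))
    two = all(
        sum(eps3(m, n, p)*eps3(m, n, t) for m, n in product(r, repeat=2))
        == 2*delta(p, t)
        for p, t in product(r, repeat=2))
    return one, two
```

Since all quantities are integers, the comparisons are exact; the full contraction evaluates to $-4! = -24$.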




Examples 


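1. Consider the Euclidian vectors $\mathbf{u} = u^\mu\mathbf{e}_\mu$, $\mathbf{v} = v^\mu\mathbf{e}_\mu$ where $\{\mathbf{e}_\mu\}$ is an orthonormal basis, e.g. the $\{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$. Then the cross product:

$$\mathbf{u}\times\mathbf{v} = \eta^{\mu\nu\rho}u_\nu v_\rho\,\mathbf{e}_\mu \quad\text{or}\quad (\mathbf{u}\times\mathbf{v})^\mu = \eta^{\mu\nu\rho}u_\nu v_\rho.$$

2. Let us prove the well known identity of vector calculus:

$$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = (\mathbf{A}\cdot\mathbf{C})\mathbf{B} - (\mathbf{A}\cdot\mathbf{B})\mathbf{C}.$$

We have:

$$[\mathbf{A}\times(\mathbf{B}\times\mathbf{C})]^\mu = \eta^{\mu\nu\rho}A_\nu(\mathbf{B}\times\mathbf{C})_\rho = \eta^{\mu\nu\rho}A_\nu\eta_{\rho\sigma\tau}B^\sigma C^\tau$$
$$= \eta^{\rho\mu\nu}\eta_{\rho\sigma\tau}A_\nu B^\sigma C^\tau$$
$$= (\delta^\mu_\sigma\delta^\nu_\tau - \delta^\mu_\tau\delta^\nu_\sigma)A_\nu B^\sigma C^\tau = (A_\nu C^\nu)B^\mu - (A_\nu B^\nu)C^\mu$$
$$= [(\mathbf{A}\cdot\mathbf{C})\mathbf{B} - (\mathbf{A}\cdot\mathbf{B})\mathbf{C}]^\mu.$$

3. For every antisymmetric tensor $A_{\mu\nu} = \frac{1}{2}(A_{\mu\nu} - A_{\nu\mu})$ we define the vector:

$$R^\mu = \frac{1}{2}\,\eta^{\mu\nu\rho}A_{\nu\rho}. \tag{13.106}$$

Conversely for every vector $R^\mu$ we define the antisymmetric tensor:

$$A_{\mu\nu} = \eta_{\mu\nu\rho}R^\rho. \tag{13.107}$$

4. The curl of a vector field $B^\mu$ is the antisymmetric tensor $A_{\mu\nu} = B_{[\mu,\nu]}$.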
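The first three examples translate directly into index sums. The following sketch is an illustration, not the book's own code: it builds the Euclidean alternating symbol, defines the cross product through it, and implements the maps (13.106) and (13.107) between a vector and an antisymmetric tensor. All function names are hypothetical.

```python
from itertools import permutations

def parity(perm):
    """+1 for an even permutation, -1 for an odd one."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

EPS3 = {p: parity(p) for p in permutations(range(3))}
R3 = range(3)

def eps(i, j, k):
    """Euclidean alternating symbol."""
    return EPS3.get((i, j, k), 0)

def cross(u, v):
    """(u x v)^m = eps^{mnp} u_n v_p (Example 1)."""
    return [sum(eps(m, n, p)*u[n]*v[p] for n in R3 for p in R3) for m in R3]

def dot(u, v):
    return sum(a*b for a, b in zip(u, v))

def vector_from_tensor(A):
    """R^m = (1/2) eps^{mnp} A_{np}, as in (13.106)."""
    return [0.5*sum(eps(m, n, p)*A[n][p] for n in R3 for p in R3) for m in R3]

def tensor_from_vector(R):
    """A_{mn} = eps_{mnp} R^p, as in (13.107)."""
    return [[sum(eps(m, n, p)*R[p] for p in R3) for n in R3] for m in R3]
```

A quick use: for integer vectors `A`, `B`, `C`, the list `cross(A, cross(B, C))` equals `[dot(A, C)*b - dot(A, B)*c for b, c in zip(B, C)]` exactly, and `vector_from_tensor(tensor_from_vector(R))` gives back `R`.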


In Minkowski space the above remain the same except that we have more terms 
and appropriate changes of sign. The relation which we shall use frequently is the 
definition of a vector from an antisymmetric tensor: 


$$\omega^a = \frac{1}{2}\,\eta^{abcd}u_b\,\omega_{cd} \tag{13.108}$$

where $u^a$ is the four-velocity. The four-vector $\omega^a$ is spacelike, that is $\omega^au_a = 0$. The inverse relation is:

$$\omega_{ab} = \eta_{abcd}\,\omega^cu^d. \tag{13.109}$$
These are the basics one should know in order to be able to follow the subsequent 


calculations. It is strongly advised that the reader reads and understands these 
concepts properly before he/she continues. 


13.10.2 The Case of Vacuum 


As we have seen the electromagnetic field in vacuum for a given LCF $\Sigma$ is described by the 3-electric and the 3-magnetic fields $\mathbf{E}$, $\mathbf{B}$ (in $\Sigma$). These fields are the components of the electromagnetic field tensor $F_{ab}$ in $\Sigma$. The tensor $F_{ab}$, although determined in $\Sigma$, is independent of $\Sigma$ and characterizes only the electromagnetic field. This means that the vector fields $\mathbf{E}$, $\mathbf{B}$ essentially require two tensors to be defined, i.e. the tensor $F_{ab}$ and the (relativistic) observer with four-velocity $u^a$ observing the electromagnetic field (that is the proper observer of $\Sigma$). Using this observation we introduce a (relativistic) electric and a (relativistic) magnetic induction field $E^a$, $B^a$ respectively by the following relations:


$$E_a = F_{ab}u^b, \qquad B_a = \frac{1}{2c}\,\eta_{abcd}F^{bc}u^d \tag{13.110}$$

where $\eta^{abcd}$ is the Levi-Civita density. For reasons which we shall explain soon we name these new four-vectors the four-electric field and the four-magnetic induction field. The following relations are obvious:

$$E_au^a = B_au^a = 0 \tag{13.111}$$

that is, the four-vectors $E^a$, $B^a$ are spacelike.


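Let us compute the components of $E^a$, $B^a$ in the proper frame of the observer, $\Sigma^+$ say, in which we assume that the electromagnetic field tensor is given by:

$$[F_{ab}] = \begin{pmatrix} 0 & -E_x^+/c & -E_y^+/c & -E_z^+/c \\ E_x^+/c & 0 & B_z^+ & -B_y^+ \\ E_y^+/c & -B_z^+ & 0 & B_x^+ \\ E_z^+/c & B_y^+ & -B_x^+ & 0 \end{pmatrix}_{\Sigma^+}$$

while $u^a = \begin{pmatrix} c \\ \mathbf{0} \end{pmatrix}_{\Sigma^+}$. Replacing these coordinate expressions in (13.110) we compute easily²⁶:

$$E^a = \begin{pmatrix} 0 \\ \mathbf{E}^+ \end{pmatrix}_{\Sigma^+}, \qquad B^a = \begin{pmatrix} 0 \\ \mathbf{B}^+ \end{pmatrix}_{\Sigma^+}.$$

²⁶ It will help if we demonstrate the computation of the components of the four-vector for the magnetic field. For example for the $B_x$ coordinate we have:

$$B_x = \frac{1}{2c}\,\eta_{xbcd}F^{bc}u^d = \frac{1}{2c}\,\eta_{xbc0}F^{bc}\,c = \frac{1}{2}\,\eta_{xyz0}\left(F^{yz} - F^{zy}\right) = F^{yz} = B_x^+.$$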




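What is the difference among the fields $E^a$, $B^a$ and $\mathbf{E}$, $\mathbf{B}$? The difference is that they coincide only in the proper frame $\Sigma^+$ of the observer, while in any other LCF $\Sigma$ the $E^a$, $B^a$ are computed by the proper Lorentz transformation relating $\Sigma$, $\Sigma^+$ while the $\mathbf{E}$, $\mathbf{B}$ are computed from the transformed components of the tensor $F_{ab}$ via the relations:

$$\mathbf{E} = -c\begin{pmatrix} F_{01} \\ F_{02} \\ F_{03} \end{pmatrix}, \qquad \mathbf{B} = \begin{pmatrix} F_{23} \\ -F_{13} \\ F_{12} \end{pmatrix}. \tag{13.112}$$

We emphasize that the fields $E^a$, $B^a$ depend on the observer $u^a$ while the tensor $F_{ab}$ is the same for all observers and characterizes the electromagnetic field only.

We consider the LCF $\Sigma$ in which there exists the electromagnetic field $F_{ab}$ given by the following matrix:

$$[F_{ab}] = \begin{pmatrix} 0 & -E_x/c & -E_y/c & -E_z/c \\ E_x/c & 0 & B_z & -B_y \\ E_y/c & -B_z & 0 & B_x \\ E_z/c & B_y & -B_x & 0 \end{pmatrix}_\Sigma. \tag{13.113}$$

In order to compute the components of the electric field observed by an observer $u^a$ whose four-velocity in $\Sigma$ is $u^a = \begin{pmatrix} \gamma c \\ \gamma\mathbf{v} \end{pmatrix}_\Sigma$ we multiply the matrices:

$$[E^a] = [F^a{}_b][u^b]$$

from which follows:

$$E^a = \begin{pmatrix} \gamma\,\dfrac{\mathbf{E}\cdot\mathbf{v}}{c} \\ \gamma\,(\mathbf{E} + \mathbf{v}\times\mathbf{B}) \end{pmatrix}_\Sigma. \tag{13.114}$$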
x 


Exercise 13.10.1. Show that the components of the magnetic field $B^a$ in $\Sigma$ are as follows:

$$B^a = \begin{pmatrix} \gamma\,\dfrac{\mathbf{B}\cdot\mathbf{v}}{c} \\ \gamma\left(\mathbf{B} - \dfrac{1}{c^2}\,\mathbf{v}\times\mathbf{E}\right) \end{pmatrix}_\Sigma. \tag{13.115}$$

Are these results compatible with the ones we found in (13.55) and (13.56)?

It is possible to express the tensor $F_{ab}$ in terms of the vector fields $E^a$, $B^a$. The simplest way to do this is to consider the fields $E^a$, $B^a$ in the proper frame of the observer $u^a$ and show that:

$$F_{ab} = -\frac{1}{c^2}\,(E_au_b - E_bu_a) + \frac{1}{c}\,\eta_{abcd}B^cu^d. \tag{13.116}$$



This equation although proved in one frame holds in all frames because it is a 
tensor equation. 


Exercise 13.10.2 Verify (13.116) by replacing $F_{ab}$ in (13.110).
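Relations (13.110), (13.114), (13.115) and (13.116) can be cross-checked numerically for arbitrary field values. The sketch below is a hedged illustration: it assumes the matrix (13.113) for $F_{ab}$, the metric $\mathrm{diag}(-1,1,1,1)$, and $\eta_{0123} = -1$; the function names and the sample numbers are hypothetical choices, not taken from the book.

```python
import math
from itertools import permutations

def parity(perm):
    """+1 for an even permutation, -1 for an odd one."""
    inv = sum(1 for i in range(len(perm))
              for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inv % 2 else 1

EPS4 = {p: parity(p) for p in permutations(range(4))}
ETA = (-1.0, 1.0, 1.0, 1.0)   # diagonal of eta_ab = diag(-1, 1, 1, 1)
R4 = range(4)

def eta_down(a, b, c, d):
    """eta_{abcd}; eta^{0123} = +1 implies eta_{0123} = -1."""
    return -EPS4.get((a, b, c, d), 0)

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def field_tensor(E, B, c):
    """Covariant components F_ab laid out as in (13.113)."""
    ex, ey, ez = (x/c for x in E)
    bx, by, bz = B
    return [[0.0, -ex, -ey, -ez],
            [ex, 0.0, bz, -by],
            [ey, -bz, 0.0, bx],
            [ez, by, -bx, 0.0]]

def four_velocity(v, c):
    """Return u^a = (gamma c, gamma v) and gamma."""
    g = 1.0/math.sqrt(1.0 - dot(v, v)/(c*c))
    return [g*c, g*v[0], g*v[1], g*v[2]], g

def four_fields(F, u, c):
    """Contravariant E^a and B^a computed from (13.110)."""
    E_low = [sum(F[a][b]*u[b] for b in R4) for a in R4]
    F_up = [[ETA[a]*ETA[b]*F[a][b] for b in R4] for a in R4]
    B_low = [sum(eta_down(a, b, k, d)*F_up[b][k]*u[d]
                 for b in R4 for k in R4 for d in R4)/(2.0*c) for a in R4]
    return [ETA[a]*E_low[a] for a in R4], [ETA[a]*B_low[a] for a in R4]

def rebuild_tensor(E_up, B_up, u, c):
    """F_ab reconstructed from E^a, B^a and u^a through (13.116)."""
    E_low = [ETA[a]*E_up[a] for a in R4]
    u_low = [ETA[a]*u[a] for a in R4]
    return [[-(E_low[a]*u_low[b] - E_low[b]*u_low[a])/(c*c)
             + sum(eta_down(a, b, k, d)*B_up[k]*u[d] for k in R4 for d in R4)/c
             for b in R4] for a in R4]
```

Comparing `four_fields(...)` with the right-hand sides of (13.114) and (13.115), and `rebuild_tensor(...)` with the original matrix, gives agreement to rounding error for any subluminal `v`.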


The four-force $F_a$ on a four-current $j^a$ which is moving under the action of the electromagnetic field $F_{ab}$ is given by the expression (see Sect. 13.8):

$$F_a = \frac{1}{c}\,F_{ab}\,j^b. \tag{13.117}$$

Hence the equation of motion of a charge of mass $m$ is:

$$\frac{d}{d\tau}(mu_a) = \frac{1}{c}\,F_{ab}\,j^b \tag{13.118}$$

where $u^a$ is the four-velocity of the proper observer of the charge and $\tau$ is the proper time of that observer.


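Exercise 13.10.3. An observer with four-velocity $u^a = \begin{pmatrix} \gamma c \\ \gamma\mathbf{v} \end{pmatrix}_\Sigma$ observes the electric field $E^a = \begin{pmatrix} \gamma\,\mathbf{E}\cdot\mathbf{v}/c \\ \gamma\,(\mathbf{E} + \mathbf{v}\times\mathbf{B}) \end{pmatrix}_\Sigma$. Show that the equation of motion of a charge $q$ in $\Sigma$ is as follows:

$$\frac{d\gamma}{d\tau} = \frac{q}{mc^2}\,\gamma\,\mathbf{E}\cdot\mathbf{v} \;\Leftrightarrow\; \dot\gamma = \frac{q}{mc^2}\,\mathbf{E}\cdot\mathbf{v} \tag{13.119}$$

$$\frac{d(\gamma\mathbf{v})}{d\tau} = \frac{q}{m}\,\gamma\,(\mathbf{E} + \mathbf{v}\times\mathbf{B}) \;\Leftrightarrow\; \dot\gamma\,\mathbf{v} + \gamma\,\mathbf{a} = \frac{q}{m}\,(\mathbf{E} + \mathbf{v}\times\mathbf{B}) \tag{13.120}$$

where $\tau$ is the proper time of the charge.

Hint: Use that the four-acceleration of the observer in $\Sigma$ is given by $a^a = \begin{pmatrix} \gamma\dot\gamma c \\ \gamma\dot\gamma\,\mathbf{v} + \gamma^2\mathbf{a} \end{pmatrix}_\Sigma$ where $\dot\gamma = \frac{d\gamma}{dt}$, $\mathbf{a} = \frac{d\mathbf{v}}{dt}$ and $t$ is time in $\Sigma$.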


In Sect. 13.6.2 we have seen that Maxwell equations in vacuum in terms of the electromagnetic field tensor $F_{ab}$ are written as follows:

$$F_{\{ab,c\}} = 0 \tag{13.121}$$

$$F^{ab}{}_{,a} = -\mu_0\,j^b \tag{13.122}$$

where $\{ab, c\}$ means cyclic sum in all enclosed indices. Furthermore the continuity equation for the charge is expressed by the equation:

$$j^a{}_{,a} = 0. \tag{13.123}$$

The above equations are the basic equations of the electromagnetic field for the vacuum in four-formalism.


13.10.3 The Electromagnetic Theory for a General Medium 


As it has been said at the beginning of this chapter the electromagnetic field in a LCF $\Sigma$ for a general medium is specified by the following four 3-vectors:

1. Electric field $\mathbf{E}$
2. Magnetic field $\mathbf{B}$
3. Electric displacement field $\mathbf{D}$
4. Magnetic displacement field $\mathbf{H}$

As the vectors $\mathbf{E}$, $\mathbf{B}$ define in $\Sigma$ the antisymmetric tensor $F_{ab}$ (13.113), in the same manner the fields $\mathbf{D}$, $\mathbf{H}$ define in $\Sigma$ the antisymmetric tensor $K_{ab}$ as follows:

$$[K_{ab}] = \begin{pmatrix} 0 & -D_x & -D_y & -D_z \\ D_x & 0 & H_z/c & -H_y/c \\ D_y & -H_z/c & 0 & H_x/c \\ D_z & H_y/c & -H_x/c & 0 \end{pmatrix}_\Sigma. \tag{13.124}$$

Therefore in a general medium the electromagnetic field is expressed by two different (antisymmetric) tensors, the electromagnetic field tensor $F_{ab}$ and the induction field tensor $K_{ab}$. This means that for a general material an observer with four-velocity $u^a$ defines the following four spacelike four-vectors:

1. The electric field:

$$E^a = F^{ab}u_b \tag{13.125}$$

2. The electric displacement field:

$$D^a = \frac{1}{c}\,K^{ab}u_b \tag{13.126}$$

3. The magnetic induction field:

$$B_a = \frac{1}{2c}\,\eta_{abcd}F^{bc}u^d \tag{13.127}$$

4. The magnetic displacement field:

$$H_a = \frac{1}{2}\,\eta_{abcd}K^{bc}u^d \tag{13.128}$$
These four-vector fields express the ‘interaction’ of the observer with the electromagnetic field and they are in general different for the same electromagnetic field and different observers. In order to compute the components of these fields in a LCF $\Sigma$ in which the four-velocity of the observer is $u^a = \begin{pmatrix} \gamma c \\ \gamma\mathbf{v} \end{pmatrix}_\Sigma$ we use (13.124) and multiply with the matrix representation of $u^a$. For example the components of the electric induction $D^a$ in $\Sigma$ are computed from the multiplication of the matrices:

$$[D^a] = \frac{1}{c}\,[K^a{}_b][u^b].$$

Replacing we find:

$$D^a = \begin{pmatrix} \gamma\,\dfrac{\mathbf{D}\cdot\mathbf{v}}{c} \\ \gamma\left(\mathbf{D} + \dfrac{1}{c^2}\,\mathbf{v}\times\mathbf{H}\right) \end{pmatrix}_\Sigma. \tag{13.129}$$


Exercise 13.10.4 Show that the components of the magnetic displacement field $H^a$ in $\Sigma$ are as follows:

$$H^a = \begin{pmatrix} \gamma\,\dfrac{\mathbf{H}\cdot\mathbf{v}}{c} \\ \gamma\,(\mathbf{H} - \mathbf{v}\times\mathbf{D}) \end{pmatrix}_\Sigma. \tag{13.130}$$

Relations (13.125), (13.126), (13.127) and (13.128) can be inverted to express $F_{ab}$ and $K_{ab}$ in terms of the vector fields $E^a$, $B^a$, $D^a$, $H^a$ as follows:

$$F_{ab} = \frac{1}{c^2}\,(-E_au_b + E_bu_a) + \frac{1}{c}\,\eta_{abcd}B^cu^d \tag{13.131}$$

$$K_{ab} = \frac{1}{c}\,(-D_au_b + D_bu_a) + \frac{1}{c^2}\,\eta_{abcd}H^cu^d. \tag{13.132}$$

Exercise 13.10.5 Prove (13.132) by replacing the expressions of $K_{ab}$ in (13.128).

Hint: See Exercise 13.10.2.


In the literature use is made of the dual form of the tensors $F_{ab}$ and $K_{ab}$ by applying the theory of bivectors. A bivector is any second order antisymmetric tensor $X_{ab} = -X_{ba}$. A bivector is called simple if it can be written in the form $X_{ab} = A_{[a}B_{b]}$, where $A^a$, $B^a$ are vectors.

The dual bivector $X^{*ab}$ of a bivector $X_{ab}$ is defined as follows:

$$X^{*ab} = \frac{1}{2}\,\eta^{abcd}X_{cd} \;\Leftrightarrow\; X^{ab} = -\frac{1}{2}\,\eta^{abcd}X^*_{cd}. \tag{13.133}$$


It is easy to show that in the LCF & the components of the dual bivectors 
preab K xa are: 


0 D, Dy D; 
. Dy O E./c  —Ey/c 
= : 13.134 
Foo Dy —E,/c 0 E,/c ( ) 
D, Ey/c —E,;/c 0 z 




0 —HAy,/c —Hy/c —H;/c 
* H,./c 0 D, — Dy 
= ‘ 13.1 
eb Hy/c —D, 0 Dy (p82) 
Hz/c Dy — D, ) > 


Exercise 13.10.6 Show that in the comoving frame the magnetic induction and the 
magnetic displacement field are given by the relation: 


0 0 

B H. 
Fi? =| 7* ; Rye ae 

By Hy 

Be} 54 ALS 54+ 


Then prove that the dual bivectors F*,, K;), in terms of the vector fields of the 
electromagnetic field are written as follows: 


x ! ! 1 cd 
Fup = —— Batty + — Bolla — > Nabed E u (13.136) 
c Cc Cc 
x I ! ! cd 
Kip = ——= Haup + = Abua — =Nabcd Du". (13.137) 
Cc c Cc 


Exercise 13.10.7 Prove the following formulae: 


1 
Bo =—F* yy, Ha = Ke up (13.138) 
Cc 
1 xbc, d 1 «bc, d 
Eq = ~ 5 Mabe F u“, Da = — 5, Nabe K u. (13.139) 


13.10.4 The Electric and Magnetic Moments 


Consider an electromagnetic field which is described by the tensors Fyp, Kap. We 
define the magnetization tensor: 


Eg 
Mi See 2 aR (13.140) 
Lo 


If uv“ is the four-velocity of an observer we define the four-vectors: 


Pq 


1 b 
— Mapu (13.141) 
Cc 


1 
= Nabed Mu". (13.142) 


Ma 5 




P® is called the polarization four-vector and M, the magnetization four-vector. 
Exercise 13.10.8 


a. Show that for a general medium: 


BP, = Dy = eE; (13.143) 
1 

Mo= HB, (13.144) 
Lo 


Deduce that in empty space P* = M*% = 0, hence the polarization and the 
magnetization fields measure properties of matter. 

b. Show that in a homogeneous and isotropic medium the polarization and the 
magnetization vector are given by the following formulae: 


P* = (e9 — e)E* (13.145) 
a (eg a 
M? = (4 = 1) H", (13.146) 
Lo 


c. Making use of (13.114), (13.115), (13.129) and (13.130) show that: 


Pl= (Coes Y (w eoE) Lyx (H as »)) (13.147) 
7 c Ho z 


4 (H— B)-v B 
M¢=(y : (i a vx (D cob) ‘ (13.148) 


x 


13.10.5 Maxwell Equations for a General Medium 


For a general medium Maxwell equations take the form:

$$K^{ab}{}_{;b} = J^a \tag{13.149}$$

$$F^{*ab}{}_{;b} = 0 \tag{13.150}$$

where $J^a$ is the four-current density and $F^{*ab} = \frac{1}{2}\,\eta^{abcd}F_{cd}$ is the dual bivector of the electromagnetic field tensor $F^{ab}$. In the following we consider $c = 1$.

Exercise 13.10.9 Show that equations (13.149) and (13.150) reduce to the 3-d Maxwell equations (13.1), (13.2) and (13.3) in the proper frame of the material. More specifically show that:

$F^{*ab}{}_{;b} = 0$ reduces to the equations $\operatorname{div}\mathbf{B} = 0$, $\operatorname{curl}\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}$

$K^{ab}{}_{;b} = J^a$ reduces to the equations $\operatorname{div}\mathbf{D} = \rho$, $\operatorname{curl}\mathbf{H} = \mathbf{J} + \frac{\partial\mathbf{D}}{\partial t}$.

Exercise 13.10.10 Prove that the field equation (13.150) can be written as follows:

$$F^{*ab}{}_{;b} = 0 \;\Leftrightarrow\; F_{\{ab;c\}} = 0 \tag{13.151}$$

where $\{ab; c\}$ means cyclic sum in all indices enclosed.

Hint: $F^{*ab}{}_{;b} = 0 \Rightarrow \eta^{abcd}F_{cd;b} = 0$. Multiply with $\eta_{arst}$ and expand the product $\eta_{arst}\eta^{abcd}$ to get the result. The inverse is obvious.


The four-current by means of the four-velocity $u^a$ is decomposed as follows:

$$J^a = \frac{1}{c}\,\rho\,u^a + j^a \tag{13.152}$$

where:

$$\rho = -u^aJ_a, \qquad j^a = h^a{}_bJ^b. \tag{13.153}$$

$\rho$ is the charge density and $j^a$ is the conduction current. The current $\rho u^a$ we call the convection current. More on the 1+3 decomposition of the conduction current we discuss in Sect. 13.11.

The antisymmetry of $K^{ab}$ implies $K^{ab}{}_{;ab} = 0$ which leads to the continuity equation for the charge:

$$J^a{}_{;a} = 0. \tag{13.154}$$


13.10.6 The 1+3 Decomposition of Maxwell Equations 


In order to write Maxwell equations in terms of the various vector fields we 
must decompose them wrt the four-velocity u* of the observer who observes the 
electromagnetic field. These equations are the relativistic covariant generalization 
of the 3-d Newtonian equations and consist of two sets of equations, called the 
constraint and the propagation equations for the electromagnetic field. 

Before we proceed with the computation of these equations we need to recall the 1+3 decomposition of the tensor $u_{i;j}$. This is given by the identity (not equation!)²⁷:

²⁷ We use semicolon (;) to indicate the partial derivative and not comma (,) as we do for the rest of the book. The reason is that the results we derive hold also in General Relativity where we have the Riemannian covariant derivative, which is indicated with semicolon. These results are important, therefore we see no reason for not giving them in all their generality. If the reader finds it hard to follow the formalism he/she can replace semicolon with comma and all results go through without any change.


$$u_{i;j} = \omega_{ij} + \sigma_{ij} + \frac{1}{3}\,\theta\,h_{ij} - \dot u_iu_j \tag{13.155}$$

where the (symmetric) tensor $h_{ij} = \eta_{ij} + u_iu_j$ is the standard tensor which projects perpendicular to $u^i$, that is $h_{ij}u^j = 0$, and the various tensors involved are defined as follows:


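$$\omega_{ij} = h_i{}^kh_j{}^l\,u_{[k;l]}, \qquad \theta = u^i{}_{;i}, \qquad \sigma_{ij} = h_i{}^kh_j{}^l\,u_{(k;l)} - \frac{1}{3}\,\theta\,h_{ij}. \tag{13.156}$$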


Proposition 13.10.1 The 1+3 decomposition of the equation F mee » = O gives the 
equations: 


hgB? , = 207 Eg (13.157) 


; 2 
hgB? = u",,B’ — 0B — I*(E) = (04, + 0%, — =n", ) B® — I*(E) 


(13.158) 
where the ‘current’: 
I9(E) = 0° up (tc Ea — Ec;a) (13.159) 
and a dot over a symbol denotes covariant differentiation wrt u“. 
The 1+3 decomposition of the equation K ap = J® gives the equations: 
gD? = —-q + 20H, (13.160) 


: 2 
hg D? = u*,,D? — 6D* + I*(H) — j* = (0%, + 0%, — on", ) D? +1*(H) — j? 
(13.161) 
where the ‘current’: 
1°) = °° up (teHa — He-a). (13.162) 
Proof We make extensive use the formulae (/3./38) and (/3.139). 
Equation F*“’ »=0 
u“ —projection 


Ua ‘ae = 0. 


We compute: 


1 ‘ 
tg Fe —_ (uaF**”) +b epee _ —cB’, a Fr ey + Pe ttauty 




1 
= —cB?., — F** way + c= BY ita. (13.163) 
. Cc 


The vorticity vector w“ is defined by the relation: 


1 
of = sat Upc ? Wab = Nabcaw’ u4 
This gives: 
Fe? ogy = —™ nabed Epus@o ut = 2 (5255 — 6 A) E,-uso°ut = —2c* E,w* 


Also the term: 
b 1 b 1 b a,b 1 a pb 
1 7 Billa = BY, +35 = Buy = B (Nab + — ually) = h,B., 
Replacing in (/3./63) we find the constraint equation for B“: 
hiB? = 0 E 13.164 
bP wa = eo" a: (13. ) 
Spatial Projection 
b 
hy F* “ = 0. 
We compute: 
a yx*be 1 a iad 
hy F :¢ = ar bY Fea: e 
1 
= sh on [2 Etta) :e _ Nedrs(U" B*).¢] 
_ a _bcde a _bcde r Ss r ps 
= —2h bl (Ec:eUd aa E.ud:e) —f pn Nedrs(Uu Bo +uB se). 
The first two terms give: 


At = —2h* yn?" (Eeettd oP Ecua:e) = 
= Se) ae wae OP _ 2h? pn?” E-(wae _ lide) 
= SOO" Ec ctha -_ 2h? yn?" E-wde a 2h? pn?" Ecitdue. 


= 27? E.. guy — 2? upc Eg = —2n°“ up (ic Ea — Ec-a). 




The term: 
2h" 4 Ec(—2) (625° — 5£5")u" ws = 0 
therefore: 
AY = 24 Ee. guy — 2 pit Eg = — 29 up (te Ea — Eeza)- 


The last term gives: 
C8 = hy Nears(U" eB + U" BY) = 
= 2h" 4 (8736 — 5€8)(u" .eB* + u" Be) 
= 2h, (u?. Be < u? Be. _ uw" B° - u’B?..) 
= 2h,(0"..+.0".6+ soe =a JR’ 2089 Oh Be 
=2(0%, +0", — 50% eBS —Dh* RB’. 
Combining the last two results we have the propagation equation for B@: 
h®,B? = (0%, + 0%, — Oh .)BS + I“(E) (13.165) 
where the E—current is given by the relation: 
UE) Si agi kg = EBay). (13.166) 


Equation K ap =F 
Uu projection 


wa = 
We compute: 
Wake"; a (uak®) | = uUg.pK® = 
= Dd’, — (ah — Ugly) K@ = pis ee Gennes 
The term: 


K” wap = K* > nabed ut = —2H ap. 




Therefore we have the constraint equation for D® : 
D6. — D tig = —p — 20° Ag. (13.167) 
Spatial projection 
n%4K©.. = hy, J? = j*. 
The lhs of the equation gives: 
h@,[(DPuo + Dou?).¢ — 1°" (ug He) ;c] = 
=H); |-2° SPD a ePal a Gas wa He:e) | 
S97. D? = Dui De (0°. + sh + 0%, — iu) — hyn?" wacHe 
4 1" tau He _ nf" 4g Hoe 
=—h",D’ + D¢ (0°. +04. - =n.) — n°°@-H*® — u220° H, 
+ ee uctig He — OP uy Hera: 
The term: 


Abed etl” aa Ha 


= 2 (5454 — 8484) = 2u%w? Hp. 


‘d 
—n** °wacHe = HO opel =) 


Replacing in the last relation we find the propagation equation for D@ : 


2 
hg D? = u*.,D? —0D* + I°(H) — j* = (o* +04. — =n.) Do + 1°(H) — j4 


(13.168) 

where the H—current is defined as follows: 
1°) = °° uy (teHa — He-a). (13.169) 
oO 


Equations (13.157) and (13.160) are called the constraint equations (because they contain divergences) and equations (13.158) and (13.161) the propagation equations (because they express the derivative of the fields along $u^a$) for the electric and the magnetic field.


13.11 The Four-Current of Conductivity and Ohm’s Law 


We consider a charge $q$ moving with four-velocity $u^a$ in a region of space where there exists an electromagnetic field $F_{ab}$. The charge “sees” an electric field $E^a$ and a magnetic field $B^a$ given by (13.110):

$$E^a = F^a{}_bu^b \tag{13.170}$$

$$B^a = \frac{1}{2c}\,\eta^{abcd}F_{bc}u_d \tag{13.171}$$

while the tensor $F_{ab}$ in terms of the fields $E^a$, $B^a$ and $u^a$ is given by relation (13.116):

$$F_{ab} = \frac{1}{c^2}\,(u_aE_b - u_bE_a) - \frac{1}{c}\,\eta_{abcd}u^cB^d. \tag{13.172}$$

Finally we recall the equation of continuity for charge:

$$J^a{}_{;a} = 0. \tag{13.173}$$


In order to study the four-vector $J^a$ we 1+3 decompose it along the four-velocity $u^a$ of the charge and then we give a physical meaning to each irreducible part. From the standard analysis of 1+3 decomposition we have:

$$J^a = -\frac{1}{c^2}\,(J^bu_b)\,u^a + h^a{}_bJ^b \tag{13.174}$$

where $h_{ab}$ is the tensor projecting perpendicularly to $u^a$. It follows that in every LCF the four-current $J^a$ determines and is determined by one invariant (the $J^bu_b$) and one spacelike vector (the $h^a{}_bJ^b$). Consequently in order to give a physical interpretation to the four-current $J^a$ we must give an interpretation to the invariant $J^bu_b$ and the spacelike vector $h^a{}_bJ^b$. As we have emphasized the physical interpretation of a relativistic quantity is made by its identification with a corresponding Newtonian physical quantity in the proper frame of the observer. We consider the following interpretation/identification:

• The invariant $J^au_a$ is the charge density of the charge in the proper frame of the charge.

• The spacelike four-vector $h^a{}_bJ^b$ is the 3-vector of electrical conductivity in the proper frame of the charge.


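Let us compute the components of the four-current $J^a$ in a LCF $\Sigma$ in which the four-velocity $u^a = \begin{pmatrix} \gamma c \\ \gamma\mathbf{v} \end{pmatrix}_\Sigma$. We write:

$$J^a = \begin{pmatrix} j^0 \\ \mathbf{j} \end{pmatrix}_\Sigma \tag{13.175}$$

where the components $j^0$ and $\mathbf{j}$ have to be computed. From the definition of the tensor $h_{ab}$ we have:

$$h_{ab}J^b = \left(\eta_{ab} + \frac{1}{c^2}\,u_au_b\right)J^b = J_a + \frac{1}{c^2}\,(J^bu_b)\,u_a$$

$$= (-j^0, \mathbf{j})_\Sigma + \frac{1}{c^2}\,(-\gamma c\,j^0 + \gamma\,\mathbf{j}\cdot\mathbf{v})\,(-\gamma c, \gamma\mathbf{v})_\Sigma$$

$$= \left(-j^0 + \gamma^2j^0 - \gamma^2\,\frac{\mathbf{j}\cdot\mathbf{v}}{c},\; \mathbf{j} - \frac{\gamma^2}{c}\,j^0\mathbf{v} + \frac{\gamma^2}{c^2}\,(\mathbf{j}\cdot\mathbf{v})\,\mathbf{v}\right)_\Sigma$$

$$= \left(\beta^2\gamma^2j^0 - \gamma^2\,\frac{\mathbf{j}\cdot\mathbf{v}}{c},\; \mathbf{j} - \frac{\gamma^2}{c}\,j^0\mathbf{v} + \frac{\gamma^2}{c^2}\,(\mathbf{j}\cdot\mathbf{v})\,\mathbf{v}\right)_\Sigma.$$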


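Here we have exhausted all our assumptions. Essentially all equations we have obtained up to now are mathematical identities because we have not used any physical law. In order to continue we consider Ohm’s Law which is stated as follows:

$$h_{ab}J^b = k_{ab}F^b{}_cu^c. \tag{13.176}$$

$k_{ab}$ is a tensor which we call the electric conductivity tensor. For a homogeneous and isotropic medium the tensor $k_{ab} = k\,\eta_{ab}$ where $k$ is a constant called the electric conductivity of the medium. Subsequently for such a medium Ohm’s Law reads:

$$h^a{}_bJ^b = kF^a{}_cu^c = kE^a. \tag{13.177}$$

Ohm’s Law makes possible the calculation of the charge density and the conduction current in $\Sigma$. Indeed, replacing in (13.177) the components in $\Sigma$ of the various quantities involved (raising the index of $h_{ab}J^b$ changes the sign of its zeroth component) we find:

$$k\left(\gamma\,\frac{\mathbf{E}\cdot\mathbf{v}}{c},\; \gamma\,(\mathbf{E} + \mathbf{v}\times\mathbf{B})\right)_\Sigma = \left(-\beta^2\gamma^2j^0 + \gamma^2\,\frac{\mathbf{j}\cdot\mathbf{v}}{c},\; \mathbf{j} - \frac{\gamma^2}{c}\,j^0\mathbf{v} + \frac{\gamma^2}{c^2}\,(\mathbf{j}\cdot\mathbf{v})\,\mathbf{v}\right)_\Sigma. \tag{13.178}$$

The zeroth component gives:

$$\frac{\mathbf{j}\cdot\mathbf{v}}{c} = \beta^2j^0 + \frac{k\,\mathbf{E}\cdot\mathbf{v}}{\gamma c} \tag{13.179}$$

and the spatial component:

$$\mathbf{j} = \frac{\gamma^2}{c}\left(j^0 - \frac{1}{c}\,\mathbf{j}\cdot\mathbf{v}\right)\mathbf{v} + k\gamma\,(\mathbf{E} + \mathbf{v}\times\mathbf{B}). \tag{13.180}$$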
Cc Cc 




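We note that the zeroth component follows from the spatial part if the latter is scalar multiplied with $\mathbf{v}$, hence the zeroth component is included in the spatial part and we ignore it.

We continue with the spatial part of the four-vector. We note that the quantity in parenthesis gives:

$$\frac{\gamma^2}{c}\left(j^0 - \frac{1}{c}\,\mathbf{j}\cdot\mathbf{v}\right) = \frac{\gamma^2}{c}\left(j^0 - \beta^2j^0 - \frac{k\,\mathbf{E}\cdot\mathbf{v}}{\gamma c}\right) = \frac{1}{c}\left(j^0 - k\gamma\,\frac{\mathbf{E}\cdot\mathbf{v}}{c}\right).$$

Replacing in (13.180) we find:

$$\mathbf{j} = \frac{1}{c}\,j^0\,\mathbf{v} + k\gamma\left(\mathbf{E} + \mathbf{v}\times\mathbf{B} - \frac{\mathbf{E}\cdot\mathbf{v}}{c^2}\,\mathbf{v}\right). \tag{13.181}$$

We conclude that in a charged homogeneous and isotropic medium the 3-current $\mathbf{j}$ in a LCF $\Sigma$ has two parts:

1. The convection current:

$$\mathbf{j}_{conv} = \frac{1}{c}\,j^0\,\mathbf{v} \tag{13.182}$$

which depends on the motion of the charged medium and is due to the charge density in $\Sigma$.

2. The conduction current:

$$\mathbf{j}_{cond} = k\gamma\left(\mathbf{E} + \mathbf{v}\times\mathbf{B} - \frac{\mathbf{E}\cdot\mathbf{v}}{c^2}\,\mathbf{v}\right) \tag{13.183}$$

which is due to the fields $\mathbf{E}$, $\mathbf{B}$ in $\Sigma$ and depends on the electric conductivity $k$ of the charged medium.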
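The split of the 3-current into a convection and a conduction part can be checked numerically. The sketch below is an illustration under stated assumptions: it builds the four-current from the 1+3 decomposition together with Ohm's Law, $J^a = \frac{s}{c^2}u^a + kE^a$ with $s = -J^au_a$ and $E^a = (\gamma\,\mathbf{E}\cdot\mathbf{v}/c,\ \gamma(\mathbf{E} + \mathbf{v}\times\mathbf{B}))$, and then compares its spatial part with the sum of (13.182) and (13.183). The function names and sample values are hypothetical.

```python
import math

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def current_components(s, k, E, B, v, c):
    """Components (j0, j) of J^a = (s/c^2) u^a + k E^a, where s = -J^a u_a."""
    g = 1.0/math.sqrt(1.0 - dot(v, v)/(c*c))
    vxB = cross(v, B)
    j0 = s*g/c + k*g*dot(E, v)/c
    j = [s*g*v[i]/(c*c) + k*g*(E[i] + vxB[i]) for i in range(3)]
    return j0, j, g

def split_current(j0, k, E, B, v, c, g):
    """Convection part j0 v / c and conduction part k*gamma*(E + v x B - (E.v) v / c^2)."""
    vxB = cross(v, B)
    conv = [j0*v[i]/c for i in range(3)]
    cond = [k*g*(E[i] + vxB[i] - dot(E, v)*v[i]/(c*c)) for i in range(3)]
    return conv, cond
```

For any sample values the sum `conv + cond` reproduces `j` to rounding error. Setting `s = 0` (an uncharged conductor in its proper frame) still gives a non-zero `j0 = k*g*dot(E, v)/c` in the frame where the medium moves.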


The physical meaning of this result is that if a conductor is not charged (that is $J^au_a = 0$) and moves in a LCF with velocity $\mathbf{v}$, it appears to be charged, with $j^0 = k\gamma\,\mathbf{E}\cdot\mathbf{v}/c$. (More on this topic can be found in Exercise 13.4.2).

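We compute next the components of the four-vector $J^a$ in $\Sigma$ in terms of the zeroth component $j^0$ in $\Sigma$.²⁸ From the previous considerations we have:

$$J^au_a = -\gamma c\,j^0 + \gamma\,\mathbf{j}\cdot\mathbf{v} = \gamma c\left(-j^0 + \beta^2j^0 + \frac{k\,\mathbf{E}\cdot\mathbf{v}}{\gamma c}\right) = -\frac{c}{\gamma}\,j^0 + k\,\mathbf{E}\cdot\mathbf{v}.$$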


²⁸ Note that the components are not the same as the coefficients of the 1+3 decomposition, because the former are just components whereas the latter are tensors.




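Therefore the component $j^0$ of the four-vector $J^a$ in $\Sigma$ is:

$$j^0 = \frac{\gamma}{c}\,(-J^au_a) + k\gamma\,\frac{\mathbf{E}\cdot\mathbf{v}}{c}. \tag{13.184}$$

Concerning the spatial part of the four-current $J^a$ in $\Sigma$ we have from (13.181):

$$\mathbf{j} = \frac{1}{c}\,j^0\,\mathbf{v} + k\gamma\left(\mathbf{E} + \mathbf{v}\times\mathbf{B} - \frac{\mathbf{E}\cdot\mathbf{v}}{c^2}\,\mathbf{v}\right)$$

$$= \frac{1}{c}\left(j^0 - k\gamma\,\frac{\mathbf{E}\cdot\mathbf{v}}{c}\right)\mathbf{v} + k\gamma\,(\mathbf{E} + \mathbf{v}\times\mathbf{B})$$

$$= \frac{\gamma}{c^2}\,(-J^au_a)\,\mathbf{v} + k\gamma\,(\mathbf{E} + \mathbf{v}\times\mathbf{B}).$$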
Cc 


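The conduction current can be written in an alternative form which helps us to understand its physical meaning. We project (13.183) along and perpendicularly to the 3-velocity vector $\mathbf{v}$ of the charged medium in $\Sigma$ and find:

$$k\gamma\left(\mathbf{E} + \mathbf{v}\times\mathbf{B} - \frac{\mathbf{E}\cdot\mathbf{v}}{c^2}\,\mathbf{v}\right)\cdot\mathbf{v} = k\gamma\left(\mathbf{E}\cdot\mathbf{v} - \frac{\mathbf{E}\cdot\mathbf{v}}{c^2}\,v^2\right) = \frac{k}{\gamma}\,\mathbf{E}\cdot\mathbf{v} = \frac{kv}{\gamma}\,E_\parallel = \frac{kv}{\gamma}\,E^+_\parallel$$

and

$$k\gamma\,(\mathbf{E}_\perp + \mathbf{v}\times\mathbf{B}) = k\,\mathbf{E}^+_\perp$$

that is, the two currents are expressed in terms of the corresponding parts of the electric field $\mathbf{E}^+$ in the proper frame of the charged medium, where $\mathbf{E}^+_\parallel = \mathbf{E}_\parallel$ and $\mathbf{E}^+_\perp = \gamma\,(\mathbf{E}_\perp + \mathbf{v}\times\mathbf{B})$. We conclude that the conduction current in $\Sigma$ is:

$$\mathbf{j}_{cond} = \frac{k}{\gamma}\,\mathbf{E}^+_\parallel + k\,\mathbf{E}^+_\perp. \tag{13.185}$$

From this relation it becomes clear that the conduction current is due to the electromagnetic field and not to the charge density of the medium.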


Example 13.11.1 A charged isotropic and homogeneous medium with electric 
conductivity k moves in a LCF © with velocity u. In “Newtonian ” electrodynamics 
Ohm’s Law states that j = kE, where j is the conduction current in © and E the 
electric field in X&. Assuming that the scalar quantity k is Lorentz invariant prove 
that the Lorentz covariant form of Ohm’s Law in Special Relativity is hgpJ >=—kE, 
where: 


13.11 The Four-Current of Conductivity and Ohm’s Law 487 


a 


a. u“ is the four-velocity of the charge defined by the 3-velocity u in &, that is 
ul = () ; 
YV/ > 


b. J“ is a four-vector with components J“ = () where p is the charge density 
x 


in X. 
c. E¢ =F vue is the relativistic electric field which corresponds to the four-velocity 


u“. 


Solution
It is enough to prove that the covariant expression holds in one LCF and
specifically in the proper frame $\Sigma^+$ of the charged medium. In this frame the
four-velocity is $u^a = (c, \mathbf{0})_{\Sigma^+}$ and the electric field is $E^a = F^{ab}u_b = (0, \mathbf{E}^+)_{\Sigma^+}$. Suppose
that in this frame $J^a = (\rho^+ c, \mathbf{j}^+)_{\Sigma^+}$. We replace in the given expression the various
quantities in terms of their components and find $\mathbf{j}^+ = k\mathbf{E}^+$ which coincides with
the “Newtonian” $\mathbf{j} = k\mathbf{E}$ written in $\Sigma^+$.
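The projection used in this example can be checked numerically. The sketch below is only an illustration under stated assumptions: units with $c = 1$, signature $(-,+,+,+)$, and hypothetical values for the proper charge density `rho0` and the conductivity `k`. It builds a four-current $J^a = \rho_0 u^a + kE^a$ with $E^a$ normal to $u^a$ and confirms that $h^a{}_b J^b$ returns $kE^a$ for a moving medium.

```python
import math

C = 1.0                      # assumption: units with c = 1
ETA = (-1.0, 1.0, 1.0, 1.0)  # assumption: signature (-,+,+,+), so u^a u_a = -c^2


def inner(a, b):
    """Lorentz inner product eta_ab a^a b^b."""
    return sum(g * x * y for g, x, y in zip(ETA, a, b))


def four_velocity(v):
    """u^a = (gamma c, gamma v) for a 3-velocity v."""
    gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in v) / C**2)
    return (gamma * C,) + tuple(gamma * x for x in v)


def project(u, x):
    """h^a_b x^b = x^a + u^a (u_b x^b) / c^2, the part of x normal to u."""
    s = inner(u, x) / C**2
    return tuple(xi + ui * s for xi, ui in zip(x, u))


u = four_velocity((0.3, -0.2, 0.5))
# Any vector normal to u can play the role of E^a = F^ab u_b.
E = project(u, (0.7, 1.0, -2.0, 0.5))
rho0, k = 2.0, 3.0           # hypothetical proper charge density and conductivity
J = tuple(rho0 * ui + k * Ei for ui, Ei in zip(u, E))
hJ = project(u, J)           # left-hand side h^a_b J^b of Ohm's law
```

Because $h^a{}_b u^b = 0$, the convection part $\rho_0 u^a$ drops out and only the conduction part survives, independently of the chosen velocity.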


13.11.1 The Continuity Equation $J^a{}_{;a} = 0$ for an Isotropic
Material


The equation $J^a{}_{;a} = 0$ is a direct consequence of Maxwell equations; therefore
it does not give information which is not already included in Maxwell equations.
However it can be used in order to express the electric conductivity through Maxwell
equations.
We have:
\[ J^a{}_{;a} = (\rho_0 u^a + kE^a)_{;a} = 0 \tag{13.186} \]
from which follows:
\[ \dot{\rho}_0 + \rho_0 u^a{}_{;a} + k_{,a}E^a + kE^a{}_{;a} = 0 \tag{13.187} \]
where $\dot{\rho}_0 = \rho_{0,a}u^a$ and we have assumed that the material is isotropic (but not
homogeneous!) so that $k$ need not be constant. We compute the divergence of the
field $E^a$. We have:

\[ E^a{}_{;a} = (F^{ab}u_b)_{;a} = F^{ab}{}_{;a}u_b + F^{ab}u_{b;a}. \tag{13.188} \]

But Maxwell equations give:
\[ F^{ab}{}_{;a}u_b = -\mu J^b u_b = \mu\rho_0. \tag{13.189} \]

Also from the decomposition (13.155) of $u_{a;b}$ and the antisymmetry of $F^{ab}$ we
have:

\[ F^{ab}u_{b;a} = F^{ab}(\omega_{ba} - \dot{u}_b u_a) = -F^{ab}\omega_{ab} + \dot{u}_b E^b. \tag{13.190} \]
Replacing in (13.188) we find:
\[ E^a{}_{;a} = \mu\rho_0 - F^{ab}\omega_{ab} + \dot{u}_b E^b. \tag{13.191} \]

Finally the conservation equation (13.187) gives, with $\theta = u^a{}_{;a}$:
\[ k_{,a}E^a + k\left(\mu\rho_0 - F^{ab}\omega_{ab} + \dot{u}_a E^a\right) + \dot{\rho}_0 + \rho_0\theta = 0. \tag{13.192} \]

Equation (13.192) holds for a general velocity and a general electromagnetic field.
It is a differential equation which determines the electric conductivity along the
direction of the field $E^a$.


13.12 The Electromagnetic Field in a Homogeneous 
and Isotropic Medium 


The fields $E^a$, $B^a$, $D^a$, $H^a$ and equations (13.157), (13.158), (13.160) and (13.161)
describe the electromagnetic field in a general medium. Since it is impossible
to solve these equations for an arbitrary medium we consider special cases by
assuming relations amongst these fields. These relations we call constitutive
relations. One such relation we used at the beginning of this chapter to define (in
the rest frame of the material only!) the homogeneous and isotropic material by the
requirements:

\[ \mathbf{D} = \varepsilon\mathbf{E}, \qquad \mathbf{B} = \mu\mathbf{H} \tag{13.193} \]

where the coefficients $\varepsilon$, $\mu$ are constants, characteristic of the material, and satisfy
the relation:

\[ \varepsilon\mu = \frac{1}{v^2} \tag{13.194} \]

where $v$ is the (group) velocity of the electromagnetic field within the material. For
vacuum these coefficients are the $\varepsilon_0$, $\mu_0$, and satisfy the relation (13.9):

\[ \varepsilon_0\mu_0 = \frac{1}{c^2}. \tag{13.195} \]

The refraction index $n$ of the material is defined by the ratio:
\[ n = \frac{c}{v} > 1 \tag{13.196} \]

and it is given by the relation:

\[ n = \sqrt{\frac{\varepsilon\mu}{\varepsilon_0\mu_0}} = c\sqrt{\varepsilon\mu}. \tag{13.197} \]


The index of refraction is the physical quantity which differentiates a homogeneous 
and isotropic material from the vacuum (the extreme such material). 
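As a quick numerical illustration of (13.194) to (13.197), the sketch below evaluates the wave speed and the refraction index for a hypothetical non-magnetic material of relative permittivity 2.25. The function names and the material values are assumptions made for the example only.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m
MU0 = 1.25663706212e-6   # vacuum permeability, H/m


def wave_speed(eps: float, mu: float) -> float:
    """Speed v from eps * mu = 1 / v^2, relation (13.194)."""
    return 1.0 / math.sqrt(eps * mu)


def refraction_index(eps: float, mu: float) -> float:
    """n = sqrt(eps * mu / (eps0 * mu0)), relation (13.197)."""
    return math.sqrt(eps * mu / (EPS0 * MU0))


c = wave_speed(EPS0, MU0)        # vacuum value, relation (13.195)
eps, mu = 2.25 * EPS0, MU0       # hypothetical material
v = wave_speed(eps, mu)
n = refraction_index(eps, mu)    # should agree with c / v, relation (13.196)
```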

This definition of the homogeneous and isotropic material is not suitable in
Special Relativity because it is given in the proper frame only and in terms of
3-vectors in that frame. The proper way to state a constitutive relation in relativity
is in terms of the four-vectors $E^a$, $B^a$, $D^a$, $H^a$. Indeed in that case this relation
is covariant hence observer independent. Following this remark we define a
(relativistic) homogeneous and isotropic material as one in which the electromagnetic
field vectors $E^a$, $B^a$, $D^a$, $H^a$ which are assigned by some observers $u^a$, satisfy the
following requirements:

\[ D^a = \varepsilon E^a, \qquad B^a = \mu H^a, \qquad j^a = kE^a \tag{13.198} \]

where the constants $\varepsilon$, $\mu$ are the dielectric constant and the magnetic permeability
of the material respectively, and $k$ is the electric conductivity of the material.
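Example 13.12.1

i. Show that condition (13.198) implies the following conditions on the tensors
   $F_{ab}$, $K_{ab}$ of the electromagnetic field:

\[ \frac{1}{c}K_{ab}u^b = \varepsilon F_{ab}u^b \tag{13.199} \]
\[ \mu K_{[ab}u_{c]} = \frac{1}{c}F_{[ab}u_{c]}. \tag{13.200} \]

ii. Define the tensor:

\[ \varepsilon_{abcd} = \frac{1}{\mu c}\left(\eta_{ac} - \frac{\chi}{c^2}u_a u_c\right)\left(\eta_{bd} - \frac{\chi}{c^2}u_b u_d\right) \tag{13.201} \]

   where $\chi = n^2 - 1$ is the electric susceptibility of the material and $n$ is the
   refraction index of the material. Show that for a homogeneous and isotropic
   material the induction tensor $K_{ab}$ is related to the electromagnetic field tensor
   $F^{ab}$ as follows:

\[ K_{ab} = \varepsilon_{abcd}F^{cd}. \tag{13.202} \]

\[
\begin{aligned}
\varepsilon_{abcd}F^{cd} &= \frac{1}{\mu c}\left(\eta_{ac} - \frac{\chi}{c^2}u_a u_c\right)\left(\eta_{bd} - \frac{\chi}{c^2}u_b u_d\right)F^{cd} \\
&= \frac{1}{\mu c}\left(F_a{}^d + \frac{\chi}{c^2}u_a E^d\right)\left(\eta_{bd} - \frac{\chi}{c^2}u_b u_d\right) \\
&= \frac{1}{\mu c}\left(F_{ab} + \frac{\chi}{c^2}u_a E_b - \frac{\chi}{c^2}u_b E_a\right) \\
&= \frac{1}{\mu c}\left[\frac{1}{c^2}(-E_a u_b + E_b u_a) + \frac{1}{c}\eta_{abcd}B^c u^d + \frac{\chi}{c^2}u_a E_b - \frac{\chi}{c^2}u_b E_a\right] \\
&= \frac{1}{\mu c}\left[\frac{1}{c^2}(1+\chi)(-E_a u_b + E_b u_a) + \frac{1}{c}\eta_{abcd}B^c u^d\right] \\
&= \frac{\varepsilon}{c}(-E_a u_b + E_b u_a) + \frac{1}{c^2}\eta_{abcd}H^c u^d \\
&= \frac{1}{c}(-D_a u_b + D_b u_a) + \frac{1}{c^2}\eta_{abcd}H^c u^d \\
&= K_{ab}.
\end{aligned}
\]

□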


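In order to express (13.198) in terms of the 3-vector fields we consider a LCF
$\Sigma$ in which the four-vectors $E^a$, $B^a$, $D^a$, $H^a$ are given by (13.114), (13.115) and
(13.129), (13.130) respectively and find:

\[ \mathbf{D} + \frac{1}{c^2}\mathbf{v}\times\mathbf{H} = \varepsilon(\mathbf{E} + \mathbf{v}\times\mathbf{B}) \tag{13.203} \]
\[ \mu(\mathbf{H} - \mathbf{v}\times\mathbf{D}) = \mathbf{B} - \frac{1}{c^2}\mathbf{v}\times\mathbf{E}. \tag{13.204} \]
Replacing $\mathbf{B}$ into the first equation we find:

\[ \mathbf{D} + \frac{1}{c^2}\mathbf{v}\times\mathbf{H} = \varepsilon\mathbf{E} + \varepsilon\mu\,\mathbf{v}\times\mathbf{H} - \varepsilon\mu\,\mathbf{v}\times(\mathbf{v}\times\mathbf{D}) + \frac{1}{c^2}\varepsilon\,\mathbf{v}\times(\mathbf{v}\times\mathbf{E}). \tag{13.205} \]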




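Using the identity of vector calculus:
\[ \mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = (\mathbf{A}\cdot\mathbf{C})\mathbf{B} - (\mathbf{A}\cdot\mathbf{B})\mathbf{C} \]
and relations (13.194) and (13.195) we end up with the conditions:

\[ \mathbf{D}_\parallel = \varepsilon\mathbf{E}_\parallel \tag{13.206} \]

\[ (1 - n^2\beta^2)\mathbf{D}_\perp = \varepsilon(1 - \beta^2)\mathbf{E}_\perp + \frac{n^2 - 1}{c^2}\mathbf{v}\times\mathbf{H}. \tag{13.207} \]

We see that in $\Sigma$ condition (13.193) does not hold for a homogeneous and
isotropic material (except if $n = 1$, i.e. in empty space!).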


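Exercise 13.12.1 Show that for a homogeneous and isotropic material:

\[ \mathbf{B}_\parallel = \mu\mathbf{H}_\parallel \tag{13.208} \]
\[ (1 - n^2\beta^2)\mathbf{B}_\perp = \mu(1 - \beta^2)\mathbf{H}_\perp + \frac{1 - n^2}{c^2}\mathbf{v}\times\mathbf{E}. \tag{13.209} \]

For $\beta \ll 1$ the above expressions (13.206), (13.207) and (13.208), (13.209)
reduce to:

\[ \mathbf{D} = \varepsilon\mathbf{E} + \frac{n^2 - 1}{c^2}\mathbf{v}\times\mathbf{H} \tag{13.210} \]

\[ \mathbf{B} = \mu\mathbf{H} + \frac{1 - n^2}{c^2}\mathbf{v}\times\mathbf{E}. \tag{13.211} \]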


These relations are used widely in the study of the electromagnetic field in a 
homogeneous and isotropic medium. 
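The exact relations (13.206) to (13.209) can be checked against the covariant conditions (13.203) and (13.204) numerically. The following sketch assumes units with $c = 1$ and hypothetical sample values for the fields, the velocity and the material constants. It builds $\mathbf{D}$, $\mathbf{B}$ from the parallel and perpendicular formulas and returns the residuals of the two constitutive equations, which should vanish up to rounding.

```python
from typing import Tuple

Vec = Tuple[float, float, float]


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def lin(s: float, a: Vec, t: float, b: Vec) -> Vec:
    """Linear combination s*a + t*b."""
    return tuple(s * x + t * y for x, y in zip(a, b))


def split(a: Vec, v: Vec):
    """Parts of a parallel and perpendicular to v."""
    par = lin(dot(a, v) / dot(v, v), v, 0.0, v)
    return par, lin(1.0, a, -1.0, par)


def fields_in_sigma(E, H, v, eps, mu, c=1.0):
    """D and B in Sigma from (13.206)-(13.209)."""
    n2, beta2 = eps * mu * c**2, dot(v, v) / c**2
    den, kappa = 1.0 - n2 * beta2, (n2 - 1.0) / c**2
    E_par, E_perp = split(E, v)
    H_par, H_perp = split(H, v)
    D_perp = lin(eps * (1 - beta2) / den, E_perp, kappa / den, cross(v, H))
    B_perp = lin(mu * (1 - beta2) / den, H_perp, -kappa / den, cross(v, E))
    return lin(eps, E_par, 1.0, D_perp), lin(mu, H_par, 1.0, B_perp)


def residuals(E, H, D, B, v, eps, mu, c=1.0):
    """Left minus right sides of (13.203) and (13.204)."""
    r1 = lin(1.0, lin(1.0, D, 1 / c**2, cross(v, H)),
             -eps, lin(1.0, E, 1.0, cross(v, B)))
    r2 = lin(mu, lin(1.0, H, -1.0, cross(v, D)),
             -1.0, lin(1.0, B, -1 / c**2, cross(v, E)))
    return r1, r2


# Hypothetical sample data: n^2 = eps * mu = 2.25 in units with c = 1.
E, H, v = (1.0, 2.0, -0.5), (0.3, -1.0, 0.8), (0.2, 0.1, -0.1)
eps, mu = 2.0, 1.125
D, B = fields_in_sigma(E, H, v, eps, mu)
r1, r2 = residuals(E, H, D, B, v, eps, mu)
```

Setting `v` to a small vector reproduces (13.210) and (13.211) to first order in $\beta$.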


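Exercise 13.12.2. Show that the polarization four-vector and the magnetization
four-vector under a Lorentz transformation transform as follows:

\[ \mathbf{P}_\parallel = \mathbf{P}^0_\parallel, \qquad \mathbf{P}_\perp = \gamma\left(\mathbf{P}^0_\perp + \frac{1}{c^2}\mathbf{v}\times\mathbf{M}^0\right) \tag{13.212} \]
\[ \mathbf{M}_\parallel = \mathbf{M}^0_\parallel, \qquad \mathbf{M}_\perp = \gamma\left(\mathbf{M}^0_\perp - \mathbf{v}\times\mathbf{P}^0\right) \tag{13.213} \]

where $\mathbf{P}^0$, $\mathbf{M}^0$ are these vectors in the proper frame $\Sigma^+$ of the medium and
$u^a = (\gamma c, \gamma\mathbf{v})_\Sigma$ is the four-velocity of the medium. Deduce that a medium which is
polarized but not magnetized in one LCF is both polarized and magnetized in another LCF.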




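For small velocities $\beta \to 0$ relations (13.212) and (13.213) give:
\[ \mathbf{P} = \mathbf{P}^0, \qquad \mathbf{M} = \mathbf{P}^0\times\mathbf{v} \tag{13.214} \]

which means that polarized material appears magnetized.
Let us consider a magnet resting in $\Sigma^+$ so that $\mathbf{P}^0 = 0$, $\mathbf{M}^0 \neq 0$. Then in another
LCF $\Sigma$ (13.212) and (13.213) give:

\[ \mathbf{P} = \gamma\frac{1}{c^2}\mathbf{v}\times\mathbf{M}^0, \qquad \mathbf{M} = \mathbf{M}^0. \tag{13.215} \]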


This means that a moving permanent magnet carries an electric moment giving 
rise to the phenomenon of homopolar induction utilized widely in electrical 
engineering. 

There remain the propagation and the constraint equations for a homogeneous 
and isotropic material. 

From the corresponding equations for a general medium i.e. equations (13.157), 
(13.158), (13.160) and (13.161) follows: 


1 
h?,H?, = —20"E, (13.216) 
ae 
1 
h’, E>, = = + =20." Ha (13.217) 
a yb a b a 1 a a a 2 a b ) a 
h",H° =u".,H’ —0H =e (E) = of, +o, — 30h", H ae (E) 
(13.218) 
a pb a b a 1 a lia 
ht ,E? =u", BE” —@E* + —I*(H) — -j 
‘ é E 
=(o44+04,— 2 one aes Lacy — lia (13.219) 
= b b~ 3 b z oe! . 
where: 
14 (EB) = 1° up (tc Ea — Eesa) (13.220) 


19(H) = °°°4up,(tcHa — Hea) (13.221) 




are the electric field and the magnetic field currents. The constraint and the 
propagation equations correspond to the 3-d equations of the electromagnetic field 
as follows: 


divE = —p/e 
divH = 0 


h®” Eq.) — 204 Hy = —2 
h® Hy. — 720" Eq =0 


+ ¢ 


WBE? = (04, + 0%, — Fon4,) B+ 10%) j) @ H=-lit+ivxH 
ng H® = (04, +o, — 30n%,) Hb — LI“() @ Ho-lyxk 


13.13 Electric Conductivity and the Propagation Equation
for $D^a$


The electric conductivity is defined by Ohm’s law which in its simplest form is
stated as follows$^{29}$:

\[ h^a{}_b J^b = j^a = kE^a \tag{13.222} \]
We consider the propagation equation of $D^a$:
ayyb a a 2 a b a a 
h,D’ = Peres es D’+I°(A) —- j 
and contract with $D_a$ to get:


: 2 
De Dy= (04, + w4, — on", ) D? Da + Dal“(H) — kDa E*. 


Solving this for k we find: 
—_ : 4 6,D°D? — 26D? +E I°(H) 
DgE* |, 2 : 3 7 


29 We set $c = 1$.




The term: 
EqI*(H) = uy [tic Ha — Hea] Ea 
= ee upticHa Eq = 7004 up Hed Ea 
= —UgS* — n??°4 a, Hg Ea 


where $S^a = \eta^{abcd}u_b E_c H_d$ is the Poynting vector. We conclude that according to
Ohm’s Law the electric conductivity is given by:


1 ee 2 
k= a |— 50 + onsD"DP 30D° ig St ity HaEa 

a 
(13.223) 


We assume a homogeneous and isotropic material so that $D^a = \varepsilon E^a$. Replacing
in (13.223) we find:


we dD 1 é : 
k= -@+ 56) +5 [ =(E?) + e0qpE*E? — itgS* — nt up Hea Ea. 
(13.224) 
This relation can be written differently. Indeed: 
ae Hea Eq = (n° upHeEa) a _ 77° 4 uy. He Eq _ 7004 up He Ea:d 
= —S4 = 1 (wpa — ypu) HeEa — Up He Ea.a 
= —S4 = 1“ wpa Ea He — S?ity — Puy He Ea:d- 
The term: 
9° wg Ea He = 204 ny ars’ @ Eq He =-—2 (5256 - 5,52) u’ ow E,H. = 0. 


Replacing in (13.224) we obtain the final result: 


2 ¢ 1 
k= € sk s0e) a | 5ce 0qpE“E? + S44, + 1a He Eo . 


13.14 The Generalized Ohm’s Law 


Ohm’s Law concerns the current due to the motion of a charge in a medium in
which there exists an electromagnetic field. The standard form of this law in a
homogeneous and isotropic medium is:

\[ j^\mu = kE^\mu \tag{13.225} \]

where the scalar quantity $k$ is the electric conductivity of the medium and $E^\mu$ is the
electric field. This expression of Ohm’s Law is not general in Newtonian Physics
because it takes into account only the conduction and the convection current, whereas it
is known that in a conducting medium there are more types of electric current. One
such current is the Hall current.

In Newtonian Physics Ohm’s Law which incorporates the conduction current and
the Hall current is defined by the following relation:

\[ \mathbf{j} = k\mathbf{E} + \lambda\,\mathbf{j}\times\mathbf{B}. \tag{13.226} \]

The new coefficient $\lambda$ is called the transverse conductivity.

In this section we determine the relativistic form of Ohm’s Law when the Hall 
current is taken into consideration. 

The technique we follow in our calculations is a good working example of 
how one can transfer an expression of Newtonian Physics into a corresponding 
expression of relativistic Physics. This technique consists of two steps: 


a. Write the Newtonian equation in tensor form 

b. Develop a correspondence between the Newtonian and the relativistic tensors and 
transfer this expression in covariant form in Special Relativity. There is a (small) 
possibility that the relativistic expression is not determined uniquely from the 
corresponding Newtonian, hence before one accepts the result it is advisable to 
examine its physical significance. 


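The tensor form of the Newtonian expression (13.226) is:
\[ j^\mu = kE^\mu + \lambda\varepsilon^{\mu\nu\rho}j_\nu B_\rho. \tag{13.227} \]
We note that the current $j^\mu$ is involved in two terms therefore we have to solve this
relation in terms of $j^\mu$. We multiply with the Levi-Civita antisymmetric tensor and
get:
\[ \varepsilon_{\mu\sigma\tau}j^\mu = \varepsilon_{\mu\sigma\tau}kE^\mu + \lambda\varepsilon_{\mu\sigma\tau}\varepsilon^{\mu\nu\rho}j_\nu B_\rho. \]
The term (see (13.105)):
\[ \varepsilon_{\mu\sigma\tau}\varepsilon^{\mu\nu\rho} = \delta^\nu_\sigma\delta^\rho_\tau - \delta^\rho_\sigma\delta^\nu_\tau \]
thus:
\[ \varepsilon_{\mu\sigma\tau}j^\mu = \varepsilon_{\mu\sigma\tau}kE^\mu + \lambda(j_\sigma B_\tau - j_\tau B_\sigma). \]

We multiply with the magnetic field $B^\tau$ and find the expression:

\[ \varepsilon_{\mu\sigma\tau}j^\mu B^\tau = \varepsilon_{\mu\sigma\tau}kE^\mu B^\tau + \lambda\left[j_\sigma B^2 - (\mathbf{j}\cdot\mathbf{B})B_\sigma\right]. \tag{13.228} \]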




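Replacing in the original equation (13.227) we find:
\[ \frac{1}{\lambda}(j_\sigma - kE_\sigma) = -k\varepsilon_{\mu\sigma\tau}E^\mu B^\tau + \lambda\left[(\mathbf{j}\cdot\mathbf{B})B_\sigma - B^2 j_\sigma\right]. \tag{13.229} \]

We note that in this expression the current appears in the inner product $\mathbf{j}\cdot\mathbf{B}$. We
multiply (13.229) with $B^\sigma$ and find:
\[ \mathbf{j}\cdot\mathbf{B} = k(\mathbf{E}\cdot\mathbf{B}). \tag{13.230} \]

Replacing we get the required Newtonian tensor expression$^{30}$:

\[ (1 + \lambda^2 B^2)j_\mu = kE_\mu - \lambda k\varepsilon_{\sigma\mu\tau}E^\sigma B^\tau + \lambda^2 k(\mathbf{E}\cdot\mathbf{B})B_\mu. \tag{13.231} \]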
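The solution of (13.226) for the current can be verified numerically in its 3-vector form, $(1 + \lambda^2 B^2)\mathbf{j} = k\mathbf{E} + \lambda k\,\mathbf{E}\times\mathbf{B} + \lambda^2 k(\mathbf{E}\cdot\mathbf{B})\mathbf{B}$, which follows from taking the cross and the inner product of (13.226) with $\mathbf{B}$. The sketch below uses hypothetical values for the fields and the two conductivities; the function name `hall_current` is an assumption made for the example.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def hall_current(E, B, k, lam):
    """Closed-form solution of j = k E + lam (j x B) for the current j."""
    b2, eb = dot(B, B), dot(E, B)
    ExB = cross(E, B)
    scale = 1.0 / (1.0 + lam**2 * b2)
    return tuple(scale * (k * e + lam * k * x + lam**2 * k * eb * b)
                 for e, x, b in zip(E, ExB, B))


# Hypothetical field values and conductivities.
E, B = (1.0, 0.5, -2.0), (0.3, -0.4, 1.2)
k, lam = 2.5, 0.7
j = hall_current(E, B, k, lam)

# Residual of the original implicit law (13.226); it should vanish.
jxB = cross(j, B)
residual = tuple(ji - k * ei - lam * xi for ji, ei, xi in zip(j, E, jxB))
```

For `lam = 0` the function reduces to the simple law $\mathbf{j} = k\mathbf{E}$.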


Having found the tensor form of the law in Newtonian Physics we continue with
its relativistic generalization. The first step to take is to find the corresponding
four-vectors. For the current we have the four-current $J^a$. Concerning the coordinate
system we consider a relativistic observer $u^a$, who interacts with (observes) the
electromagnetic field. We 1+3 decompose $J^a$ wrt $u^a$ and get:

\[ J^a = \rho u^a + h^a{}_b J^b \tag{13.232} \]
where $\rho$ is the charge density as measured by the observer $u^a$. Subsequently:


a. We consider the correspondence: 


d 
Euot — —Nabcdu 


b. We identify the spatial part $h^a{}_b J^b$ with the 3-current $j^\mu$ which we calculated
above.
Finally we obtain the required$^{31}$ relativistic form of the generalized Ohm Law:

\[ J^a = \rho u^a + \frac{1}{1 + \lambda^2 B^b B_b}\left[kE^a + \lambda k\,\eta^{abcd}E_b u_c B_d + \lambda^2 k(E^b B_b)B^a\right]. \tag{13.233} \]


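30 A proof using the classical vector calculus is the following. We take the cross and the inner
product of (13.226) with $\mathbf{B}$:

\[ \mathbf{j}\times\mathbf{B} = k(\mathbf{E}\times\mathbf{B}) + \lambda(\mathbf{j}\cdot\mathbf{B})\mathbf{B} - \lambda B^2\mathbf{j}, \qquad \mathbf{j}\cdot\mathbf{B} = k(\mathbf{E}\cdot\mathbf{B}). \]

Subsequently we replace in (13.226) and get the required expression.
31 See Bekenstein and Oron (1978) Phys Rev D 18, 1809.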




Equation (13.233) gives the four-current in terms of the (relativistic) electric and 
magnetic fields and takes into account the conduction current as well as the Hall 
effect. 


13.15 The Energy Momentum Tensor of the Electromagnetic 
Field 


Consider an electromagnetic field described by the tensors $F^{ab}$, $K_{ab}$ which satisfy
the field equations (13.149) and (13.150):


Cae i (13.234) 
pe =o (13.235) 
where $J^a$ is the four-current density vector and $F^{*ab} = \frac{1}{2}\eta^{abcd}F_{cd}$ is the dual
bivector of $F^{ab}$. The four-force on the current $J^a$ due to the electromagnetic field
is:
| ab | ab 1 b | ab : 
Fa = -F® J = —FOK,o, = — (FA K,°) | — <P K,’. 


We consider the tensor?2 


1 c 1 cd 
Tap = —— | FacK, + 7800 (Feak ) (13.236) 


and compute its divergence. We have: 


1 1 
a 4b Cc He a) + 4 ee .a 


1 1 
= Fat SFUK, + xk (Feak™) (13.237) 


ia 


The term: 


3? An equivalent definition of the energy momentum tensor for the electromagnetic field is 


1 1 
Tay = [Fockis = 85 (Fuk) 
Cc 4 


Obviously the two definitions are the same due to the antisymmetry of Kap. 




1 1 . 1 
oe Ky = ae [F%, _ F* »| K" = ed [ Fub,c ce Fea,b] 


1 
= KH Phe (13.238) 


where in the last step we have used Maxwell equation in the form (13.151). 
We specialize our study to a homogeneous and isotropic material. In this case: 


a a 1 b b i 
D® = €E® => —Kqgpu = €Fapu & —Kaqo = € Fao (13.239) 
c c 
1 
B® = nH" => nabea (cP = ux!) u? =0& Fyy = ucK py. (13.240) 
c 
Replacing in (13.238) we find: 


1 1 1 
5 Kn Fed = 5 Kn Food - 5 KO Fu. 


Replacing in (13.237) we obtain the final result:
\[ T^{ab}{}_{;b} = -F^a \tag{13.241} \]

that is, for a homogeneous and isotropic material the divergence of the tensor $T^{ab}$ is
the four-force on the current $J^a$. Because the vacuum is a special homogeneous and
isotropic material this relation also holds in vacuum. However there is a difference
between the vacuum and a homogeneous and isotropic material. Indeed in vacuum
the tensor $T^{ab}$ is symmetric whereas for a homogeneous and isotropic material this
tensor is not symmetric.




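Let us compute the expression of $T^{ab}$ in terms of the vector fields of the
electromagnetic field in a homogeneous and isotropic material. The invariant:

\[
\begin{aligned}
F^{ab}K_{ab} &= \left[\frac{1}{c^2}(-E^a u^b + E^b u^a) + \frac{1}{c}\eta^{abcd}B_c u_d\right]
\left[\frac{1}{c}(-D_a u_b + D_b u_a) + \frac{1}{c^2}\eta_{abrs}H^r u^s\right] \\
&= \frac{2}{c^3}(E^a D_a)(u^b u_b) + \frac{1}{c^3}\eta^{abcd}\eta_{abrs}B_c u_d H^r u^s \\
&= -\frac{2}{c}(E^a D_a) - \frac{2}{c^3}\left(\delta^c_r\delta^d_s - \delta^c_s\delta^d_r\right)B_c u_d H^r u^s \\
&= -\frac{2}{c}(E^a D_a) - \frac{2}{c^3}(B_c H^c)(u_a u^a) \\
&= \frac{2}{c}\left[B_a H^a - E^a D_a\right].
\end{aligned}
\tag{13.242}
\]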


The term: 
c 1 1 die 
FacK y = m7) (—Eque + Ecua) + a MacdeB u 


1 1 
E (—D°uy A Dpu‘) 4 1 pes" | 


1 2 c 
= = [-EuDo(-0?) — (ED ua 
1 cyyr,,s 1 cpd_e 
oF Pa lebrs E HH wug — ca NacdeD Bruuy 


1 
+ 3 "Nacde notes n fb Hus Bou? 
Cc 


The term: 


Nacaent"® Nfb Hus Bou? = —neaaen!"*n fo Hrs Boue 


= [4 9783 52 858" + 87 as6f — 8 8F 5 + 6885 87 535732 | nb Hrs Blue 


= nab(H, B’)(—c*) — Hg By(—c”) — (A, B" uautp 


it 
=< [nan B* — (Ar B" uate + Hs : 




We introduce the Poynting four-vector and the polarization four-vector by the
formulae:

\[ S_a = \frac{1}{c}\eta_{abcd}E^b H^c u^d \tag{13.243} \]

\[ P_a = \frac{1}{c}\eta_{abcd}D^b B^c u^d \tag{13.244} \]


and have: 


: 1 
FacKS = a [Eas + Hq By 


1 c c ec 1 
C2 [E-D + H.B ] Matty — Nap( HB") — a2 Shula — Paup}. 


Adding these results we find: 


1 


1 
Tab = —— [Fuck +r > Nab (Fouk*)| 
c 4 
_ _ Ea Dp Aa By 4 [E.D* A.B) Uqguy — Nab( HB) = = Spa — Pautp 
ce +5 Nab [BcH® — E°De] 
- _ Ay Bp [E.D* A.B! Uqguh — 4 Spita — Pau 


| 


1 
a2 Sola + Paup |. 


(13.245) 


—5nab [B-H® + E°De] 


This can be written differently as follows: 


1 1 1 . ; 
Tab a (—E,D, + Ha Bp) 4 5 hap 4 aa Maly (H.B f E.D°) { 


This tensor has the following irreducible parts wrt the four-velocity u“ (see 
(12.11)): 


1 
w= T,putu? = ; (H.BS + E-D°) (13.246) 
Sa = —C7h° Tp? (13.247) 
P, = —h? Tu“ (13.248) 
czd 1 1 c a 
Map = Nghs Ted = — | EaDo + Ha By — shav (H-B°+E-D°)|. (13.249) 


It is obvious that the tensor $T_{ab}$ is not symmetric. In order to find the physical
significance of each of the irreducible parts we consider the proper frame of the
observer, $\Sigma^+$ say, in which the four-velocity $u^a = c\delta^a_0$. Let us assume that in this
frame the electromagnetic field is given by the 3-vectors $\mathbf{E}$, $\mathbf{B}$, $\mathbf{D}$, $\mathbf{H}$ so that from
(13.114), (13.115), (13.129) and (13.130) we have:

\[ E^a = (0, \mathbf{E})_{\Sigma^+}, \quad B^a = (0, \mathbf{B})_{\Sigma^+}, \quad D^a = (0, \mathbf{D})_{\Sigma^+}, \quad H^a = (0, \mathbf{H})_{\Sigma^+} \]

whereas from (13.243) and (13.244) we compute$^{33}$ for the spacelike four-vectors
$S^a$, $P^a$:

\[ S^a = \left(0, \frac{1}{c}\eta_{\mu bc0}E^b H^c u^0\right) = (0, \mathbf{E}\times\mathbf{H})_{\Sigma^+} \tag{13.250} \]
\[ P^a = \left(0, \frac{1}{c}\eta_{\mu bc0}D^b B^c u^0\right) = (0, \mathbf{D}\times\mathbf{B})_{\Sigma^+}. \tag{13.251} \]
mle 
Replacing the expressions of $E^a$, $B^a$, $D^a$, $H^a$ in (13.246) and (13.249) we find:

\[ w = \frac{1}{2}(\mathbf{H}\cdot\mathbf{B} + \mathbf{E}\cdot\mathbf{D}) \tag{13.252} \]
1 
Mab = ~~ [Eu Dv + Hy By — w8,y] 548 (13.253) 
Cc 


where in (13.253) we have assumed a Cartesian coordinate system. The quantity
$w$ expresses the energy density of the electromagnetic field in $\Sigma^+$, the vectors
$S^a$, $P^a$ measure the momentum transfer and the tensor $\Pi_{ab}$ is the stress tensor for
the electromagnetic field when considered as a ‘fluid’.


Exercise 13.15.1 Show that for a homogeneous and isotropic material of dielectric
constant $\varepsilon$ and magnetic permeability $\mu$:

i. The polarization vector:

\[ P^a = \varepsilon\mu S^a \tag{13.254} \]
ii. The energy density:
\[ w = \frac{1}{2}\left(\mu H^2 + \varepsilon E^2\right) \tag{13.255} \]
iii. The stress tensor: 
1 v 
Mab = —3 [eE, Ey + uA Hy — wbyy]d4dp (13.256) 


iv. The energy momentum tensor: 


1 1 1 1 
Tab = 3[-(EaErtutath)+5 (i + a) (ui? + cE?) + Spita +epSaup |. 
(13.257) 


33 Recall that $u^0 = c$, $u_0 = -c$.




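Exercise 13.15.2. The energy conservation law for charges and the electromagnetic
field in a homogeneous and isotropic material ($c = 1$).

i. Consider the vector identity:
\[ \operatorname{div}(\mathbf{A}\times\mathbf{B}) = -\mathbf{A}\cdot(\nabla\times\mathbf{B}) + \mathbf{B}\cdot(\nabla\times\mathbf{A}) \tag{13.258} \]

and using Maxwell equations for a general medium show that:

\[ \operatorname{div}(\mathbf{E}\times\mathbf{H}) = -\left(\mathbf{H}\cdot\frac{\partial\mathbf{B}}{\partial t} + \mathbf{E}\cdot\frac{\partial\mathbf{D}}{\partial t}\right) - \mathbf{j}\cdot\mathbf{E}. \tag{13.259} \]

ii. Show that for a homogeneous and isotropic medium of dielectric constant $\varepsilon$ and
magnetic permeability $\mu$:

\[ \operatorname{div}\mathbf{S} = -\frac{\partial w}{\partial t} - \mathbf{j}\cdot\mathbf{E} \tag{13.260} \]

where $w = \frac{1}{2}(\mu H^2 + \varepsilon E^2)$ is the energy density of the electromagnetic field and
$\mathbf{S} = \mathbf{E}\times\mathbf{H}$ is the Poynting vector.

iii. Consider the current to be a conduction current $\mathbf{j} = \rho\mathbf{u}$ where $\rho$ is the charge
density and $\mathbf{u}$ is the velocity of the charges. Then $\mathbf{j}\cdot\mathbf{E} = \rho\mathbf{E}\cdot\mathbf{u} = \mathbf{F}\cdot\mathbf{u}$ where $\mathbf{F}$
is the force density on the charge. The quantity $\mathbf{F}\cdot\mathbf{u} = \partial T/\partial t$ where $T$ is the kinetic
energy density of the charges, so that (13.260) reads:

\[ \operatorname{div}\mathbf{S} = -\frac{\partial(w + T)}{\partial t}. \tag{13.261} \]

iv. Consider a volume $V$ in which the charge has kinetic energy density $T$ and the
electromagnetic field energy density $w$. Then integrate (13.261) to obtain:

\[ -\int_V \frac{\partial(w + T)}{\partial t}\,dV = \int_V \operatorname{div}\mathbf{S}\,dV. \]

Apply Gauss Theorem to write this equation as:
\[ \frac{\partial}{\partial t}\int_V (w + T)\,dV = -\oint \mathbf{S}\cdot d\boldsymbol{\sigma}. \tag{13.262} \]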


In this form this equation expresses the conservation of energy for charges 
and the electromagnetic field. As a consequence of (13.262) we may interpret the 
Poynting vector as the flux of energy per unit time through a unit area oriented 
normally to the Poynting vector. However this interpretation is not fully justified by 
Maxwell equations because if we add to the vector S another vector S' satisfying 
the condition divS' = 0 the above result does not change. 




Exercise 13.15.3. The momentum conservation law for charges and the electromag- 
netic field in a homogeneous and isotropic material. 


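i. Consider the following identity of vector calculus:
\[ \operatorname{grad}(\mathbf{A}\cdot\mathbf{B}) = (\mathbf{A}\cdot\nabla)\mathbf{B} + (\mathbf{B}\cdot\nabla)\mathbf{A} + \mathbf{A}\times\operatorname{curl}\mathbf{B} + \mathbf{B}\times\operatorname{curl}\mathbf{A} \tag{13.263} \]

and take $\mathbf{A} = \mathbf{B}$ to find:
\[ \frac{1}{2}\nabla\mathbf{A}^2 = (\mathbf{A}\cdot\nabla)\mathbf{A} + \mathbf{A}\times\operatorname{curl}\mathbf{A}. \tag{13.264} \]
Show next that the term:
\[ (\mathbf{A}\cdot\nabla)\mathbf{A} = \frac{\partial}{\partial x^\alpha}(A_\alpha A_\beta)\mathbf{e}_\beta - \mathbf{A}\operatorname{div}\mathbf{A} \tag{13.265} \]
where $\mathbf{A} = A_\alpha\mathbf{e}_\alpha$ in the basis $\{\mathbf{e}_\alpha\}$. Conclude that identity (13.264) is written:

\[ \mathbf{A}\operatorname{div}\mathbf{A} - \mathbf{A}\times\operatorname{curl}\mathbf{A} = \frac{\partial}{\partial x^\alpha}\left(A_\alpha A_\beta - \frac{1}{2}\mathbf{A}^2\delta_{\alpha\beta}\right)\mathbf{e}_\beta. \tag{13.266} \]

ii. Consider Maxwell equations for a general medium and show that:
\[ \frac{\partial}{\partial t}(\mathbf{D}\times\mathbf{B}) = -\mathbf{D}\times\operatorname{curl}\mathbf{E} - \mathbf{B}\times\operatorname{curl}\mathbf{H} - \mathbf{j}\times\mathbf{B}. \]

From Maxwell equations we also have $\operatorname{div}\mathbf{D} = \rho$, $\operatorname{div}\mathbf{B} = 0$ where $\rho$ is the
charge density. Use these and the last relation to show that for a homogeneous
and isotropic material of dielectric constant $\varepsilon$ and magnetic permeability $\mu$:

\[ \frac{\partial}{\partial x^\alpha}\left(E_\alpha D_\beta + H_\alpha B_\beta - \frac{1}{2}(\mathbf{H}\cdot\mathbf{B} + \mathbf{E}\cdot\mathbf{D})\delta_{\alpha\beta}\right)\mathbf{e}_\beta = \rho\mathbf{E} + \mathbf{j}\times\mathbf{B} + \frac{\partial\mathbf{P}}{\partial t} \]

where $\mathbf{P} = \mathbf{D}\times\mathbf{B}$ is the polarization vector. The term $\rho\mathbf{E} + \mathbf{j}\times\mathbf{B} = \mathbf{F}$ where $\mathbf{F}$ is the
Lorentz force density.
iii. Show that the last relation can be written:

\[ \frac{\partial T_{\alpha\beta}}{\partial x^\alpha}\mathbf{e}_\beta + \frac{\partial\mathbf{P}}{\partial t} = -\mathbf{F} \tag{13.267} \]

where the tensor:

\[ T_{\alpha\beta} = -(E_\alpha D_\beta + H_\alpha B_\beta) + w\delta_{\alpha\beta}. \tag{13.268} \]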


iv. Consider a volume $V$ and integrate (13.267) over the volume $V$ to find:

\[ \int_V \frac{\partial T_{\alpha\beta}}{\partial x^\alpha}\mathbf{e}_\beta\,dV = -\int_V \mathbf{F}\,dV - \frac{\partial}{\partial t}\int_V \mathbf{P}\,dV. \tag{13.269} \]

The term $\int_V \mathbf{F}\,dV = \frac{\partial}{\partial t}\int_V \mathbf{p}\,dV$ where $\mathbf{p}$ is the linear momentum density of the charges
enclosed in the volume $V$.$^{34}$ Write (13.269) in the form:

\[ \oint_S T_{\alpha\beta}\,\mathbf{e}_\beta\,d\sigma_\alpha = -\frac{\partial}{\partial t}\int_V (\mathbf{p} + \mathbf{P})\,dV \tag{13.270} \]

and conclude (a) that the quantity $T_{\alpha\beta}\,\mathbf{e}_\beta\,d\sigma_\alpha$ represents the force acting on an
infinitesimal surface area $d\boldsymbol{\sigma}$ and (b) the quantity $\mathbf{P}$ is
the field momentum density.

v. Show that for a homogeneous and isotropic material the polarization
$\mathbf{P} = \varepsilon\mu\mathbf{S} = \frac{n^2}{c^2}\mathbf{S}$ where $n$ is the index of refraction of the medium. Note that in
this case the Poynting vector has also the interpretation of the field momentum
density. Furthermore:

\[ T_{\alpha\beta} = -\varepsilon E_\alpha E_\beta - \mu H_\alpha H_\beta + w\delta_{\alpha\beta} \]

i.e. $T_{\alpha\beta}$ is symmetric. Note that even in this case the energy momentum tensor
$T_{ab}$ is not symmetric (it is symmetric only for vacuum).


The energy momentum tensor for the electromagnetic field we have considered
is due to H. Minkowski. Its derivation is based on the assumption $T^{ab}{}_{;b} = -F^a$
and not on symmetry (i.e. $T^{ab} = T^{ba}$). Because in general we ‘assume’ the energy
momentum tensor to be symmetric, soon after Minkowski, M. Abraham suggested
another energy momentum tensor which is similar to that of Minkowski (in fact it is
based on the Minkowski energy momentum tensor) and is supposed to hold within
a medium only (in empty space they coincide). Both tensors are correct and still
today there is a discussion going on as to which energy momentum tensor should 
be considered as more appropriate. Abraham uses the equation figs = F* to define 


the four-force, which inevitably is different from the Lorentz force $F_{ab}u^b$.


34 The partial derivative $\frac{\partial}{\partial t}$ indicates that the volume $V$ is comoving, that is, it does not change in
time.


13.16 The 1+3 Decomposition of the Tensor $T_{ab}$


The energy momentum tensor $T^{ab}$ can be 1+3 decomposed wrt the vector $u^a$ as
it is done with the general (not necessarily symmetric) energy momentum tensor. To
do that we note that $T^{ab}$ in (13.245) can be written as follows:


1 1 
Te — 5 (cE AB utd? + geek RB ne 4 ut 8” en" iS) 
1 
+3 (FE +AB7)h? — (cECE? +\0H°H’), 


= eee - Bent? + Age u? a ae. =r sed 


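where

\[ \mu_{em} = \frac{1}{2}\left(\varepsilon E^2 + \lambda B^2\right) \tag{13.271} \]
\[ p_{em} = \frac{1}{6}\left(\varepsilon E^2 + \lambda B^2\right) \tag{13.272} \]
\[ q^a_{em} = S^a \tag{13.273} \]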
1 

Tim = 3 (E> + AB*)AY — (CE“E? + XH“H?). (13.274) 


We note that $\pi^{ab}_{em}$ is traceless as is expected.
One application of the above result is the determination of the equation of state for
isotropic radiation (for example the background radiation in the Universe). In this
case:
\[ q^a_{em} = 0, \qquad \pi^{ab}_{em} = 0 \tag{13.275} \]

and the energy momentum tensor becomes:
\[ T^{ab} = \frac{1}{2}\left(\varepsilon E^2 + \lambda B^2\right)u^a u^b + \frac{1}{6}\left(\varepsilon E^2 + \lambda B^2\right)h^{ab} \tag{13.276} \]

from which follows the well known result relating the pressure and the energy
density of the electromagnetic field:

\[ p_{em} = \frac{1}{3}\mu_{em}. \tag{13.277} \]
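The relation $p_{em} = \frac{1}{3}\mu_{em}$ can be illustrated with the 3-d stress tensor (13.268). The sketch below is an illustration only and assumes vacuum values $\varepsilon = \mu = 1$ in units with $c = 1$, so that $\mathbf{D} = \mathbf{E}$ and $\mathbf{H} = \mathbf{B}$; the field values are hypothetical. The isotropic pressure is taken as one third of the trace of the stress tensor, which is what survives when the anisotropic part averages to zero.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def energy_density(E, B):
    """w = (E^2 + B^2) / 2 for eps = mu = 1 (assumed vacuum units)."""
    return 0.5 * (dot(E, E) + dot(B, B))


def stress_tensor(E, B):
    """T_ab = -(E_a E_b + B_a B_b) + w delta_ab, the vacuum form of (13.268)."""
    w = energy_density(E, B)
    return [[-(E[i] * E[j] + B[i] * B[j]) + (w if i == j else 0.0)
             for j in range(3)] for i in range(3)]


E, B = (0.4, -1.1, 0.7), (1.3, 0.2, -0.5)   # hypothetical field values
w = energy_density(E, B)
T = stress_tensor(E, B)
pressure = sum(T[i][i] for i in range(3)) / 3.0   # isotropic part of the stress
```

The trace equals $3w - (E^2 + B^2) = w$ for any field configuration, so `pressure` is `w / 3` without any averaging; only the traceless part depends on the field directions.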
Closing we should remark that the four-dimensional formulation of electromag- 
netism is not an academic luxury but a practical necessity because this formalism 
leads safely to correct and consistent results which can always be translated into 
practical working equations (i.e. written in terms of 3-d quantities) for a given LCF. 


13.17 The Electromagnetic Field of a Moving Charge 


The determination of the electromagnetic field produced by a moving charge in
a LCF $\Sigma$ is an important problem with many applications. The solution of this
problem with Newtonian methods is difficult and the results cannot be checked
reliably. On the contrary the solution within the relativistic formalism gives a
complete and substantiated answer and exhibits the power and the usefulness of
this formalism.

We consider a charge $q$ whose world line has equation $c^i(\tau)$, where $\tau$ is
the proper time of the charge. We wish to determine the electromagnetic field due to
the charge at proper time $\tau$ at the spacetime point $P$ with coordinates $x^i$. Suppose
that at the moment $\tau$ the position vector of the point $P$ relative to the charge is $R^i$.
Then:

\[ R^i = x^i - c^i(\tau). \tag{13.278} \]


We make the following two assumptions: 


1. The electromagnetic field created by the charge propagates with speed $c$.
This implies that the point $P$ is on the light cone of the point $c^i(\tau)$ of the
world line. Therefore the position vector $R^i$ is a null four-vector:

\[ R^i R_i = 0. \tag{13.279} \]


2. The electromagnetic field created by the charge in the proper frame of the charge
consists only of an electric field, which is spherically symmetric with center at
the charge.

This means that in the proper frame of the charge, $\Sigma'$ say, the potentials of the
electromagnetic field are (SI system of units):

\[ \phi' = \frac{1}{4\pi\varepsilon_0}\frac{q}{r'}, \qquad \mathbf{A}' = 0. \]

Then the four-potential $\Omega^i$ in $\Sigma'$ is given by:


ods ds 
a= (#5 ; a=(-7 4.0) . 
0 a JTEQ cr y 


Suppose that in $\Sigma'$ the position vector $R^i$ of the spacetime point $P$ has
components $R^i = (ct', \mathbf{r}')_{\Sigma'}$. Because $R^i$ is null we have:

\[ -c^2t'^2 + r'^2 = 0 \Rightarrow ct' = r' \quad \text{thus} \quad R^i = (r', \mathbf{r}')_{\Sigma'} \]




where $r'$ is the length of $\mathbf{r}'$. The four-velocity of the charge in $\Sigma'$ is $u^i = (c, \mathbf{0})_{\Sigma'}$
and a simple calculation shows that $\Omega_i$ can be written covariantly as follows:


9, 1 qu; 


= 13.280 
: 4ire9 cD ( ) 


where in order to save writing we have set $D = R^i u_i = -r'c$.
Having computed the four-potential we have practically solved the problem,
because the tensor $F_{ij}$ is given by:

\[ F_{ij} = -\Omega_{i,j} + \Omega_{j,i}. \]
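In order to compute the derivative $\Omega_{i,j}$ at the point $P$ we note that:

\[ \frac{\partial}{\partial x^i}\Big|_P = \frac{\partial\tau}{\partial x^i}\Big|_{c(\tau)}\frac{d}{d\tau} = \tau_{,i}\frac{d}{d\tau}. \]

The quantity $\tau_{,i}$ is computed from $R^i$ as follows. From (13.278) we have
$R_{i,j} = \eta_{ij} - u_i\tau_{,j}$ so that:

\[ R^j R_j = 0 \Rightarrow R^j R_{j,i} = 0 \Rightarrow R^j(\eta_{ji} - u_j\tau_{,i}) = 0 \Rightarrow \tau_{,i} = \frac{R_i}{D}. \]
Accordingly the derivative of the four-velocity:

\[ u_{i,j} = \frac{du_i}{d\tau}\tau_{,j} = \dot{u}_i\tau_{,j} = \frac{1}{D}\dot{u}_i R_j \]

and that of the quantity $D$:
\[ D_{,i} = u_i + \frac{c^2}{D}R_i + \frac{1}{D}\left(\dot{u}_k R^k\right)R_i. \]
Using these expressions we compute: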


1 sq 1 qe 
ais R. 
4negcD2 '/ 4re9 D3’! 


Qi 5 = 
where:

\[ S_i = \left(1 + \frac{1}{c^2}R^k\dot{u}_k\right)u_i - \frac{D}{c^2}\dot{u}_i. \tag{13.281} \]




Finally: 


1 qe 


Fig = -Q),5 + QU = tee, De 


(S;R; — S;Ri). (13.282) 


The tensor F;; contains all the information concerning the electromagnetic field due 
to the charge. Let us see why. 


13.17.1_ The Invariants 


From (13.59) we have for the invariant X: 


i i gi ei [sini - s/R'] 
QE" 2 (are)? DET MY 
1 Pero p Hee 1. yepe 
= SPR? = (S'R; -— S'Ri) 
(47r€9)? D® (RD (4zr€9)2 D® ( : 


(because R* = 0). But: 


2 


hence: 


1 gc” 


= Garey? ae (13.283) 
Similarly from (13.59) we have for the invariant Y: 


1 = .2q2c? a 
0 


1 Pe 
4Y = =te =- 


From the values of the invariants we conclude that the electromagnetic field
produced by a moving charge in a LCF $\Sigma$, say, either consists of an electric field $\mathbf{E}$
and a magnetic field $\mathbf{B}$ which are normal to each other and with different strength,
or an electric field only (as is the case in $\Sigma'$).




13.17.2 The Fields $E^i$, $B^i$


The fields $E^i$, $B^i$ are computed from the tensor $F_{ij}$ using the relations (13.110). We
find:


j__1 4 j 
EB, = Fijue = eh D3 [siD — (Sju )Ri| 
But: 
Sui =-c? (1 : Ria 
ju =—c + a Uk 
therefore: 


1 qe 1 oy. 2 a, 
F=gops|(1+ aR (Du; re, Ri) — =i |. (13.284) 
For the magnetic field we have: 


1 ‘ 1 1 2qe . q i. 
Sat py Pil 1 SJ RK! RA gk y! 
B, = 70 Mik u= 5 nije S! Rew = ize De nije Rou. 
(13.285) 


It is easy to prove that the fields $E^i$, $B^i$ are (Lorentz!) perpendicular:
\[ E^i B_i = 0. \]
This result was expected because in the proper frame of the charge the fields
$E^i$, $B^i$ coincide with the fields $\mathbf{E}^+$, $\mathbf{B}^+$, whose inner product vanishes, therefore in
this system $E^i B_i = 0$. However this relation is covariant hence it is valid in all LCF.


13.17.3 The Liénard—Wiechert Potentials and the Fields E, B 


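Consider an arbitrary LCF $\Sigma$ in which the charge has velocity $\mathbf{u}$ and acceleration $\mathbf{a}$.
We calculate in $\Sigma$ the scalar and the vector potential as well as the fields $\mathbf{E}$, $\mathbf{B}$.
In the LCF $\Sigma$ we have the following components of the involved four-vectors:

\[ R^i = (r, \mathbf{r})_\Sigma, \qquad u^i = (\gamma c, \gamma\mathbf{u})_\Sigma, \qquad \dot{u}^i = (a_0 c, a_0\mathbf{u} + \gamma^2\mathbf{a})_\Sigma \]

where $\dot{u}^i$ is the four-acceleration of the charge and $a_0 = \gamma\,d\gamma/dt$. We compute:
\[ D = R^i u_i = -\gamma(cr - \mathbf{r}\cdot\mathbf{u}) \]
\[ R^i\dot{u}_i = -a_0(cr - \mathbf{r}\cdot\mathbf{u}) + \gamma^2(\mathbf{r}\cdot\mathbf{a}) = \frac{a_0}{\gamma}D + \gamma^2(\mathbf{r}\cdot\mathbf{a}). \]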


The four-potential is given from the covariant expression (13.280): 


1 qui 


~ Arey cD 


The zeroth component and the spatial part of the four-potential are the scalar and
the vector potential respectively. These potentials we call the Liénard-Wiechert
potentials. In order to compute these potentials it is enough to calculate the
components of $\Omega^i = (\phi/c, \mathbf{A})$. Using the above results we find easily (see (13.43)):
\[ \phi = \frac{1}{4\pi\varepsilon_0}\frac{q}{r\left(1 - \frac{u_r}{c}\right)}, \qquad \mathbf{A} = \frac{1}{4\pi\varepsilon_0}\frac{q\mathbf{u}}{c^2 r\left(1 - \frac{u_r}{c}\right)} \tag{13.286} \]

where $u_r = \frac{\mathbf{u}\cdot\mathbf{r}}{r}$ is the component of the velocity in the direction $\mathbf{r}$.
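For a charge in uniform motion the scalar potential of (13.286) can be compared with the Coulomb potential transformed from the proper frame of the charge. The sketch below is a numerical cross-check under stated assumptions: the charge moves along the $x$ axis with constant speed `u` and passes the origin at $t = 0$, and the function names are hypothetical. The retarded position is found by solving the light-cone condition $c(t - t_{ret}) = |\mathbf{r}|$ in closed form.

```python
import math

C = 299_792_458.0        # speed of light, m/s
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def phi_lienard_wiechert(q, x, y, z, t, u):
    """Scalar potential of (13.286) for a charge at x = u*t on the x axis."""
    X = x - u * t                       # present separation along x
    rho2 = y * y + z * z
    # Light-cone condition c^2 tau^2 = (X + u tau)^2 + rho^2, tau = t - t_ret.
    disc = X * X * u * u + (C * C - u * u) * (X * X + rho2)
    tau = (X * u + math.sqrt(disc)) / (C * C - u * u)
    r = C * tau                         # retarded distance
    r_x = X + u * tau                   # x component of the retarded vector r
    u_r = u * r_x / r                   # velocity component along r
    return q / (4.0 * math.pi * EPS0 * r * (1.0 - u_r / C))


def phi_boosted_coulomb(q, x, y, z, t, u):
    """Coulomb potential of the rest frame, Lorentz transformed to Sigma."""
    gamma = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    X = x - u * t
    return gamma * q / (4.0 * math.pi * EPS0
                        * math.sqrt(gamma**2 * X * X + y * y + z * z))


phi_lw = phi_lienard_wiechert(1e-9, 2.0, 1.5, -0.5, 1e-8, 0.6 * C)
phi_boost = phi_boosted_coulomb(1e-9, 2.0, 1.5, -0.5, 1e-8, 0.6 * C)
```

The two values agree to rounding error, which reflects the identity $r - \mathbf{r}\cdot\mathbf{u}/c = \sqrt{X^2 + (y^2 + z^2)/\gamma^2}$ for uniform motion.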
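From the Liénard-Wiechert potentials we cannot compute (directly) the fields
$\mathbf{E}$, $\mathbf{B}$ from the relations:

\[ \mathbf{B} = \nabla\times\mathbf{A}, \qquad \mathbf{E} = -\nabla\phi - \frac{\partial\mathbf{A}}{\partial t} \]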

and we must work with the relativistic formalism. From the components of the four- 
vectors in 2 we compute*>: 


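\[ S_i = \left(-\frac{\gamma}{c}\left[c^2 + \gamma^2(\mathbf{r}\cdot\mathbf{a})\right],\ \frac{\gamma}{c^2}\left\{\left[c^2 + \gamma^2(\mathbf{r}\cdot\mathbf{a})\right]\mathbf{u} + \gamma^2(cr - \mathbf{r}\cdot\mathbf{u})\mathbf{a}\right\}\right)_\Sigma. \]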


For the electric field E we have: 


2 f= (Soy —k9S),) 
ia ae 4rey y3 (cer —r-u)? ee neste 
1 qe Ae ae 
= + r-a)|r 

Arey Caer c [< yes) 


5 {[e? +/7(r- a) | u+y*(cr—r- wa} |. 


35 $S_i$ is defined in (13.281).




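After some simple algebra we find:

\[ \mathbf{E} = \frac{1}{4\pi\varepsilon_0}\frac{q}{\gamma^2(cr - \mathbf{r}\cdot\mathbf{u})^3}\left\{\left[c^2 + \gamma^2(\mathbf{r}\cdot\mathbf{a})\right](c\mathbf{r} - r\mathbf{u}) - \gamma^2 r(cr - \mathbf{r}\cdot\mathbf{u})\mathbf{a}\right\}. \tag{13.287} \]

This relation can be written in a form which singles out completely the
part of the field which is due to acceleration. Indeed using the identity
$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = (\mathbf{A}\cdot\mathbf{C})\mathbf{B} - (\mathbf{A}\cdot\mathbf{B})\mathbf{C}$ we write the electric field as follows:

\[ \mathbf{E} = \frac{1}{4\pi\varepsilon_0}\frac{q}{(cr - \mathbf{r}\cdot\mathbf{u})^3}\left[\frac{c^2}{\gamma^2}(c\mathbf{r} - r\mathbf{u}) + \mathbf{r}\times\left[(c\mathbf{r} - r\mathbf{u})\times\mathbf{a}\right]\right]. \tag{13.288} \]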

For the magnetic field we have: 


B = (—F3, —F31, —F 2) 


qc/Améo 
= y3(er = u)? (S2.R3 S3Ro, S3R1 = S51 R3, S, Ro _ RS) 
4 
9) 
y3(er —¥r-u) 
= qc/4r€ (% [2+ 72@-a)| uty7(cr —r-wal xr) 
yi(cr —r-u)? \c? 
1 
= —(r x E) 
cr 
that is:
\[ \mathbf{B} = \frac{1}{c}\frac{\mathbf{r}}{r}\times\mathbf{E}. \tag{13.289} \]


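A special case of much interest is the motion of a charge with constant velocity
($\mathbf{a} = 0$), e.g. the motion of free electrons within a conductor. In this case relations
(13.288) and (13.289) give:

\[ \mathbf{E} = \frac{1}{4\pi\varepsilon_0}\frac{qc^3}{\gamma^2(c^2t - \mathbf{r}\cdot\mathbf{u})^3}(\mathbf{r} - \mathbf{u}t) \tag{13.290} \]
\[ \mathbf{B} = \frac{1}{4\pi\varepsilon_0}\frac{qc}{\gamma^2(c^2t - \mathbf{r}\cdot\mathbf{u})^3}(\mathbf{u}\times\mathbf{r}) \tag{13.291} \]

where we have used that $r = ct$.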
These expressions have an easier physical interpretation if they are written in
terms of the angle $\theta$ of the vectors $\mathbf{u}$, $\mathbf{r}$ in $\Sigma$. Indeed we have:

\[ cr - \mathbf{r}\cdot\mathbf{u} = cr - ru\cos\theta = cr(1 - \beta\cos\theta) \]


512 13 The Electromagnetic Field 


Fig. 13.3 Motion of a charge 
in a general electromagnetic 
field 


Fig. 13.4 Angles in the 
proper frame of the charge 
and in & 


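hence ($\epsilon_0\mu_0 = 1/c^2$):

$$\mathbf{E} = \frac{1}{4\pi\epsilon_0}\frac{q\mathbf{r}'}{\gamma^2 r^3(1-\beta\cos\theta)^3} \qquad (13.292)$$

$$\mathbf{B} = \frac{\mu_0}{4\pi}\frac{q\,\mathbf{u}\times\mathbf{r}'}{\gamma^2 r^3(1-\beta\cos\theta)^3} \qquad (13.293)$$

where r′ = r − ut is the vector in the 3-space of Σ connecting the charge with the
point in 3-space where the electromagnetic field is created.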

In order to draw geometric conclusions from (13.292) we have to express the rhs
in terms of the vector r′. If we call φ the angle between the vectors r′, u in Σ we
have $|\mathbf{r}'_\parallel| = r'\cos\phi$, $|\mathbf{r}'_\perp| = r'\sin\phi$ (see Fig. 13.4).

We note that ut = ur/c = βr and calculate the quantity:

$$\gamma^2 r'^2_\parallel + r'^2_\perp = \gamma^2\left[(r\cos\theta - ut)^2 + (1-\beta^2)r^2\sin^2\theta\right]$$
$$= \gamma^2 r^2\left[1 + \beta^2(1-\sin^2\theta) - 2\beta\cos\theta\right]$$
$$= \gamma^2 r^2(1-\beta\cos\theta)^2.$$

But in the proper frame of the charge:

$$\gamma^2 r'^2_\parallel + r'^2_\perp = \gamma^2 r'^2\cos^2\phi + r'^2\sin^2\phi$$
$$= \gamma^2 r'^2\left[\cos^2\phi + (1-\beta^2)\sin^2\phi\right]$$
$$= \gamma^2 r'^2(1-\beta^2\sin^2\phi)$$

and finally:

$$r(1-\beta\cos\theta) = r'\sqrt{1-\beta^2\sin^2\phi}.$$
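The last identity relates the retarded quantities (r, θ) to the quantities (r′, φ) measured from the position of the charge at the moment the field is observed. A short numerical check, under the relations r′ = r − ut and ut = βr used above, can be sketched as follows (the function name is ours):

```python
import math

def retarded_vs_present(r, theta, beta):
    """Return both sides of r(1 - beta cos(theta)) = r' sqrt(1 - beta^2 sin^2(phi))."""
    r_par = r * math.cos(theta) - beta * r   # component of r' = r - ut along u
    r_perp = r * math.sin(theta)             # component of r' normal to u
    r_prime = math.hypot(r_par, r_perp)
    sin_phi = r_perp / r_prime
    lhs = r * (1.0 - beta * math.cos(theta))
    rhs = r_prime * math.sqrt(1.0 - (beta * sin_phi) ** 2)
    return lhs, rhs

samples = [(1.0, 0.3, 0.5), (2.5, 2.0, 0.9), (0.7, 1.2, 0.0)]
results = [retarded_vs_present(r, theta, beta) for r, theta, beta in samples]
```

Each pair in `results` contains two equal numbers up to rounding error.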


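The electric and the magnetic field are written in terms of the elements of r′:

$$\mathbf{E} = \frac{1}{4\pi\epsilon_0}\frac{q\mathbf{r}'}{\gamma^2 r'^3(1-\beta^2\sin^2\phi)^{3/2}} \qquad (13.294)$$

$$\mathbf{B} = \frac{\mu_0}{4\pi}\frac{q\,\mathbf{u}\times\mathbf{r}'}{\gamma^2 r'^3(1-\beta^2\sin^2\phi)^{3/2}}. \qquad (13.295)$$
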
From these relations we infer the following: 


1. The electric field is not isotropic (consequently not spherically symmetric) and (as
we show below) its strength is largest normal to the direction of the velocity and
takes the smallest value along the direction of the velocity. Consequently the lines
of force of the electric field are denser in the plane perpendicular to the velocity.

2. The magnetic field which is produced from an electric current in Σ is normal
to the plane defined by the current and the point at which we are looking for the
field. Furthermore its strength is proportional to the value of the current (j = qu).

3. In the Newtonian limit γ ≈ 1, β ≈ 0 and also ut ≪ ct so that r ≈ r′ and
relations (13.292) and (13.293) reduce to:

$$\mathbf{E} = \frac{1}{4\pi\epsilon_0}\frac{q\mathbf{r}'}{r'^3} \qquad (13.296)$$

$$\mathbf{B} = \frac{\mu_0}{4\pi}\frac{\mathbf{j}\times\mathbf{r}'}{r'^3}. \qquad (13.297)$$

The first gives the electric field of the Newtonian approach and the second is the
celebrated Biot-Savart Law.³⁶


³⁶ We recall that in the Newtonian approach the magnetic field of a current i satisfies two Laws. The
Ampère Law $\oint\mathbf{B}\cdot d\mathbf{l} = \mu_0 i$ and the Biot-Savart Law $d\mathbf{B} = \frac{\mu_0 i}{4\pi}\frac{d\mathbf{l}\times\mathbf{r}}{r^3}$ where dl is an elementary
length along the conductor and r is the point where one calculates the magnetic field (see Fig. 13.5).
Ampère’s Law is used in the cases the magnetic field has high (geometric) symmetry whereas




Fig. 13.5 Biot-Savart Law 


In order to study in depth the anisotropy of the electric field, which is a purely
relativistic phenomenon, we write (13.294) as follows:

$$\mathbf{E} = \frac{1}{4\pi\epsilon_0}\frac{q\mathbf{r}'}{r'^3}\,\frac{1}{\gamma^2(1-\beta^2\sin^2\phi)^{3/2}}. \qquad (13.298)$$

The first term in the rhs:

$$\mathbf{E}_{(0)} = \frac{1}{4\pi\epsilon_0}\frac{q\mathbf{r}'}{r'^3}$$

is the isotropic Newtonian electric field which is created at the point P in Σ by the
charge q. The second term is due to the effect of anisotropy and it is of the order β²,
hence absent in the Newtonian limit.

In order to estimate the effect of anisotropy we introduce the quantity
$f(\beta,\phi) = \frac{1}{\gamma^2(1-\beta^2\sin^2\phi)^{3/2}}$ which we plot as a function of φ for various values of β. This plot
is shown in Fig. 13.6 where the anisotropy of the quantity f(β, φ) in Σ and its
dependence on the factor β (the speed of the charge in Σ) are apparent.

We note that when β → 0 the curve tends to a straight line parallel to the x-axis,
which means that in the Newtonian limit there is no dependence on the angle
φ and the field becomes isotropic. For relativistic β the strength of the field along
the direction of motion (φ = 0) tends to zero whereas it tends to infinity near the
value φ = ±π/2 (δ-function).
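The behaviour of the anisotropy factor can be tabulated directly. The sketch below assumes the form f(β, φ) = 1/(γ²(1 − β² sin²φ)^{3/2}) read off from (13.298); the function name is our own.

```python
import math

def anisotropy_factor(beta, phi):
    """f(beta, phi) = 1 / (gamma^2 (1 - beta^2 sin^2(phi))^(3/2))."""
    return (1.0 - beta ** 2) / (1.0 - (beta * math.sin(phi)) ** 2) ** 1.5

for beta in (0.1, 0.5, 0.9, 0.99):
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    along = anisotropy_factor(beta, 0.0)            # along the velocity: 1/gamma^2
    normal = anisotropy_factor(beta, math.pi / 2)   # normal to the velocity: gamma
    print(f"beta={beta}: f(0)={along:.4f}  f(pi/2)={normal:.4f}  gamma={gamma:.4f}")
```

Along the velocity the factor equals 1/γ² and normal to it equals γ, so the ratio of the two extreme values grows as γ³.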

In the following we discuss two examples of motion of a charge in a LCF and 
calculate the resulting electromagnetic field. The first example concerns uniform 
motion and the second uniform circular motion. 


Example 13.17.1 A charge q is moving in a LCF Σ with constant velocity u. If the
charge is at the origin O of Σ at the moment t = 0 of Σ, calculate the electromagnetic
field at the point P of Σ.
Solution

Let r be the position vector of P in Σ. Because the electromagnetic field (in empty
space!) propagates with speed c, the field created by the charge when it was at the
origin O of Σ will reach P at the moment t = r/c of Σ. But then the charge will

the Biot-Savart Law is used in more general cases in which the magnetic field is computed by 
integration along the conductor. 




Fig. 13.6 The anisotropy of the electric field 


be at the position ut in Σ and the point P will have position vector wrt the charge
r′ = r − ut.³⁷

In the proper frame Σ′ of the charge the electromagnetic field has only electric
field which is given from the relation:

$$\mathbf{E}_{\Sigma'} = \frac{1}{4\pi\epsilon_0}\frac{q\,\mathbf{r}'_{\Sigma'}}{r'^3_{\Sigma'}}. \qquad (13.299)$$

In order to calculate the electric field in Σ we consider the Lorentz transformation.
However we do not need to do that because we have already computed this
field in (13.290), i.e.:

$$\mathbf{E} = \frac{1}{4\pi\epsilon_0}\frac{qc^3}{\gamma^2(c^2t-\mathbf{r}\cdot\mathbf{u})^3}(\mathbf{r}-\mathbf{u}t). \qquad (13.300)$$

As an instructive exercise let us compute the electric field directly. From the
transformation (13.54) of the fields and taking into account that $\mathbf{B}_{\Sigma'} = 0$ and

$$\mathbf{r}'_{\parallel\Sigma'} = \gamma\,\mathbf{r}'_{\parallel\Sigma}, \qquad \mathbf{r}'_{\perp\Sigma'} = \mathbf{r}'_{\perp\Sigma}$$


³⁷ For the observer on the charge the field at the point P the moment t appears to come from
the origin O of &. Because in the standard non-relativistic approach to electromagnetism time 
is understood in the Newtonian approach, the origin O is referred as the retarding point. This 
terminology has no place in the relativistic approach where time is a mere coordinate and can take 
any value depending on the frame. 


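we find:

$$\mathbf{E}_{\parallel\Sigma} = \mathbf{E}_{\parallel\Sigma'} = \frac{1}{4\pi\epsilon_0}\frac{q}{r'^3_{\Sigma'}}\mathbf{r}'_{\parallel\Sigma'} = \frac{1}{4\pi\epsilon_0}\frac{q\gamma}{r'^3_{\Sigma'}}\mathbf{r}'_{\parallel\Sigma}$$

$$\mathbf{E}_{\perp\Sigma} = \gamma\,\mathbf{E}_{\perp\Sigma'} = \gamma\frac{1}{4\pi\epsilon_0}\frac{q}{r'^3_{\Sigma'}}\mathbf{r}'_{\perp\Sigma'} = \frac{1}{4\pi\epsilon_0}\frac{q\gamma}{r'^3_{\Sigma'}}\mathbf{r}'_{\perp\Sigma}.$$

The electric field in Σ is:

$$\mathbf{E}_\Sigma = \mathbf{E}_{\parallel\Sigma} + \mathbf{E}_{\perp\Sigma} = \frac{1}{4\pi\epsilon_0}\frac{q\gamma}{r'^3_{\Sigma'}}\mathbf{r}'_\Sigma.$$

It remains to compute the length $r'_{\Sigma'}$ in Σ. We have:

$$r'^2_{\Sigma'} = r'^2_{\parallel\Sigma'} + r'^2_{\perp\Sigma'} = \gamma^2 r'^2_{\parallel\Sigma} + r'^2_{\perp\Sigma} = \gamma^2\left[(r_\parallel - ut)^2 + \frac{r_\perp^2}{\gamma^2}\right] = \gamma^2 R^2$$

where $R^2 = (r_\parallel - ut)^2 + \frac{r_\perp^2}{\gamma^2}$. Finally:

$$\mathbf{E}_\Sigma = \frac{1}{4\pi\epsilon_0}\frac{q\,\mathbf{r}'}{\gamma^2 R^3}. \qquad (13.301)$$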


This expression appears to differ from (13.290) but this is not true. Indeed we
note that:

$$R^2 = (r_\parallel - ut)^2 + (1-\beta^2)r_\perp^2 = (r\cos\theta - ut)^2 + (1-\beta^2)r^2\sin^2\theta$$
$$= r^2\left[\cos^2\theta - 2\beta\cos\theta + \beta^2 + \sin^2\theta - \beta^2\sin^2\theta\right]$$
$$= r^2(1-\beta\cos\theta)^2$$

therefore:

$$R = r(1-\beta\cos\theta) = \frac{c^2t - \mathbf{r}\cdot\mathbf{u}}{c}.$$

Replacing in (13.301) we recover (13.300).
We see once more that the anisotropy of the electric field increases as the angle
φ tends to π/2. For the limiting values φ = 0, π/2 we have:

$$E_\parallel = \frac{1}{4\pi\epsilon_0}\frac{q}{\gamma^2 r'^2}, \qquad E_\perp = \frac{1}{4\pi\epsilon_0}\frac{q\gamma}{r'^2}. \qquad (13.302)$$

The magnetic field in Σ must satisfy the relation (why?):

$$\mathbf{B} = \frac{1}{c^2}\mathbf{u}\times\mathbf{E}$$

from which follows again relation (13.297).


Example 13.17.2 A charge q moves in the LCF Σ along the periphery of a circle of
radius R with constant angular velocity ω. Calculate the electromagnetic field in Σ
at the center P of the orbit.
Solution

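We consider the origin of Σ to be at the center of the orbit, so that the
coordinates of the point P at which we are interested for the electromagnetic field
are $x^i(t) = 0$ and those of the charge $c^i(t) = \begin{pmatrix} -ct \\ R\hat{\mathbf{e}}_r \end{pmatrix}_\Sigma$. The four-vector $R^i$ is $R^i = \begin{pmatrix} r \\ \mathbf{r} \end{pmatrix}_\Sigma$ where
r = ct, $\mathbf{r} = -R\hat{\mathbf{e}}_r$. But $R^i$ is null, therefore r = R. The 3-velocity of the charge
in Σ is $\mathbf{u} = \omega R\hat{\mathbf{e}}_\theta$ and the 3-acceleration $\mathbf{a} = -\omega^2 R\hat{\mathbf{e}}_r$. These give for the
four-velocity $u^i = \begin{pmatrix} \gamma c \\ \gamma\omega R\hat{\mathbf{e}}_\theta \end{pmatrix}_\Sigma$ and for the invariant $D = R^i u_i = -R\gamma c$. We replace
these in (13.280) and calculate the four-potential:

$$\Omega^i = -\frac{1}{4\pi\epsilon_0}\frac{q u^i}{Dc} = \frac{q}{4\pi\epsilon_0 R c^2}\begin{pmatrix} c \\ \omega R\hat{\mathbf{e}}_\theta \end{pmatrix}_\Sigma. \qquad (13.303)$$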


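Having found the four-potential one calculates the antisymmetric tensor $F_{ij}$ and
subsequently the electric and the magnetic field. However it is possible to work
directly, by replacing:

$$\mathbf{r} = -R\hat{\mathbf{e}}_r, \qquad \mathbf{u} = \omega R\hat{\mathbf{e}}_\theta, \qquad \mathbf{a} = -\omega^2 R\hat{\mathbf{e}}_r$$

in the general relation (13.288). Indeed doing that we find for the electric field:

$$\mathbf{E} = -\frac{1}{4\pi\epsilon_0}\frac{q}{\gamma^2 c^3 R^3}\left[c^3 R\,\hat{\mathbf{e}}_r + Ru(c^2 + \gamma^2 u^2)\hat{\mathbf{e}}_\theta\right]$$
$$= -\frac{1}{4\pi\epsilon_0}\frac{q}{\gamma^2 R^2}\left[\hat{\mathbf{e}}_r + \beta(1 + \beta^2\gamma^2)\hat{\mathbf{e}}_\theta\right]$$
$$= -\frac{1}{4\pi\epsilon_0}\frac{q}{\gamma^2 R^2}\left[\hat{\mathbf{e}}_r + \beta\gamma^2\hat{\mathbf{e}}_\theta\right].$$

In order to find the Newtonian limit we set γ = 1, β = 0 and find the Coulomb
field:

$$\mathbf{E} = -\frac{1}{4\pi\epsilon_0}\frac{q}{R^2}\hat{\mathbf{e}}_r.$$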


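Concerning the magnetic field we have from (13.289):

$$\mathbf{B} = \frac{1}{c}\frac{\mathbf{r}}{r}\times\mathbf{E} = -\frac{1}{c}\hat{\mathbf{e}}_r\times\mathbf{E} = \frac{1}{4\pi\epsilon_0}\frac{q}{\gamma^2 R^2 c}\,\hat{\mathbf{e}}_r\times\left(\hat{\mathbf{e}}_r + \beta\gamma^2\hat{\mathbf{e}}_\theta\right)$$
$$= \frac{1}{4\pi\epsilon_0 c}\frac{q}{\gamma^2 R^2}\beta\gamma^2\,\hat{\mathbf{e}}_z$$
$$= \frac{\mu_0}{4\pi}\frac{q\omega}{R}\hat{\mathbf{e}}_z$$

which coincides with the previous result.
This expression can be compared with the well known result concerning the
electromagnetic field of a circular conductor if we consider the charge as the current:

$$i = \frac{q}{T} = \frac{q\omega}{2\pi}.$$

Then:

$$\mathbf{B} = \frac{\mu_0 i}{2R}\hat{\mathbf{e}}_z$$

which is the well known result of non-relativistic electromagnetism. If we introduce
the magnetic dipole moment of the loop:

$$\boldsymbol{\mu} = \pi R^2 i\,\hat{\mathbf{e}}_z \qquad (13.304)$$

we have for the magnetic field the expression:

$$\mathbf{B} = \frac{\mu_0}{2\pi R^3}\boldsymbol{\mu}. \qquad (13.305)$$
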
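The result of this example can be cross-checked by inserting r = −R e_r, u = ωR e_θ, a = −ω²R e_r numerically into the general field expression. The sketch below assumes the form E = q{(c²/γ²)(cr − ru) + r × [(cr − ru) × a]}/(4πε0(cr − r·u)³) for (13.288) and takes e_r, e_θ, e_z along the x, y, z axes; the function name and the units c = ε0 = 1 (hence μ0 = 1) are illustrative choices.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def fields_at_center(q, R, omega, c=1.0, eps0=1.0):
    """E and B at the centre of a circular orbit, from the general moving-charge field."""
    beta = omega * R / c
    gamma2 = 1.0 / (1.0 - beta ** 2)
    r_vec = (-R, 0.0, 0.0)              # r = -R e_r (from the charge to the centre)
    u = (0.0, omega * R, 0.0)           # u = omega R e_theta
    a = (-omega ** 2 * R, 0.0, 0.0)     # a = -omega^2 R e_r
    r = R
    s = c * r - dot(r_vec, u)           # scalar cr - r.u
    w = tuple(c * x - r * y for x, y in zip(r_vec, u))   # vector c r - r u
    acc = cross(r_vec, cross(w, a))
    pref = q / (4.0 * math.pi * eps0 * s ** 3)
    E = tuple(pref * (c ** 2 / gamma2 * wi + ai) for wi, ai in zip(w, acc))
    B = tuple(x / (c * r) for x in cross(r_vec, E))      # B = (1/c)(r/r) x E
    return E, B

q, R, omega = 1.0, 2.0, 0.3             # beta = 0.6
E, B = fields_at_center(q, R, omega)
current = q * omega / (2.0 * math.pi)   # i = q/T
B_loop = current / (2.0 * R)            # mu0 i / (2R) with mu0 = 1
```

With these values `B[2]` coincides with `B_loop`, while the components of `E` reproduce −q/(4πε0γ²R²) along e_r and −qβ/(4πε0R²) along e_θ.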

Example 13.17.3 In a LCF Σ two charges q₁ and q₂ start moving uniformly along
parallel directions with the same speed u. Calculate (in Σ!) the force between
the charges when they are moving (a) along the same direction (b) in opposite
directions.
Solution

From Fig. 13.7a we have: $\mathbf{r}'_{21}(t) = \mathbf{r}'_{21}(0) = \mathbf{l}_{21}$ = constant and φ = π/2.
Replacing in (13.298) we find for the electric field which is due to the charge q₁ at
the position of the charge q₂:

$$\mathbf{E}_{21} = \frac{1}{4\pi\epsilon_0}\frac{q_1\,\mathbf{l}_{21}}{\gamma^2 l_{21}^3(1-\beta^2)^{3/2}} = \frac{1}{4\pi\epsilon_0}\frac{q_1\gamma}{l_{21}^3}\mathbf{l}_{21} = A\,\mathbf{l}_{21}$$

where $A = \frac{1}{4\pi\epsilon_0}\frac{q_1\gamma}{l_{21}^3}$.



Fig. 13.7 The force between parallel currents 


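The magnetic field at the position of the charge q₂ is:

$$\mathbf{B}_{21} = \frac{1}{c^2}\mathbf{u}\times\mathbf{E}_{21} = \frac{A}{c^2}\mathbf{u}\times\mathbf{l}_{21}.$$

The force on the charge q₂ is the Lorentz force:

$$\mathbf{F}_{21} = q_2\left[\mathbf{E}_{21} + \mathbf{u}\times\mathbf{B}_{21}\right] = q_2 A\left[\mathbf{l}_{21} + \frac{1}{c^2}\mathbf{u}\times(\mathbf{u}\times\mathbf{l}_{21})\right]$$
$$= q_2 A(1-\beta^2)\mathbf{l}_{21}$$
$$= \frac{1}{4\pi\epsilon_0}\frac{q_1 q_2}{\gamma\,l_{21}^3}\mathbf{l}_{21}.$$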
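The cancellation between the electric and the magnetic part of the Lorentz force can be followed numerically. In the minimal sketch below the separation l is taken normal to the common velocity u, the field of q₁ at q₂ is taken as E = q₁γ l/(4πε0 l³) with B = u × E/c², and the units c = ε0 = 1 and the function name are our own illustrative choices.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def force_parallel(q1, q2, l_vec, u, c=1.0, eps0=1.0):
    """Lorentz force on q2 due to q1 when both move with the same velocity u (u normal to l_vec)."""
    l = math.sqrt(dot(l_vec, l_vec))
    gamma = 1.0 / math.sqrt(1.0 - dot(u, u) / c ** 2)
    A = q1 * gamma / (4.0 * math.pi * eps0 * l ** 3)
    E = tuple(A * x for x in l_vec)                 # electric field of q1 at q2
    B = tuple(x / c ** 2 for x in cross(u, E))      # B = u x E / c^2
    u_cross_B = cross(u, B)
    return tuple(q2 * (e + f) for e, f in zip(E, u_cross_B))

F = force_parallel(1.0, 1.0, (0.0, 2.0, 0.0), (0.6, 0.0, 0.0))   # gamma = 1.25
```

The force is directed along l and equals q₁q₂/(4πε0γl²), that is, the Coulomb force reduced by the factor γ.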


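2nd Solution
In the proper frame Σ₁ of charge q₁ the charge q₂ is fixed therefore the applied
force is:

$$\mathbf{F}_{21,\Sigma_1} = \frac{1}{4\pi\epsilon_0}\frac{q_1 q_2}{l_{21}^3}\mathbf{l}_{21}$$

(the $\mathbf{l}_{21}$ does not suffer Lorentz contraction because it is normal to the relative
velocity u). Obviously $\mathbf{F}_{21,\Sigma_1} \perp \mathbf{u}$ thus $\mathbf{F}_{21,\Sigma_1}\cdot\mathbf{u} = 0$ and the four-force on the
charge q₂ is:

$$F^i = \begin{pmatrix} 0 \\ \mathbf{F}_{21,\Sigma_1} \end{pmatrix}_{\Sigma_1}.$$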


In order to calculate the force $\mathbf{F}_{21}$ in Σ we use the Lorentz transformation with
speed u. It is left as an exercise to the reader to show that one finds the result of the
first solution.

(b) From Fig. 13.7b we have for the counter parallel motion:


Qut 
r,(t) =k — 2ut, ri, = (By + 4020, an 
(Ba + 42? 


The electric field created by the charge q; at the position of charge qo is: 


1 M1 
Ez; = 372 (Io1 — 2ut) = A(Ip1 — 2ur) 


Am €0 3 a 
2. 4u+t 
y2 bee + 4u?t? | (1 - pte) 


where A = = a ; a mE The magnetic field is given by: 
[FB +4u202] (1-8 fut ) 


1 A 
Bo = a” x E> = a x Ip]. 
The force on the charge q2 is (see (13.298)): 


Qa, 
“yet 


Fo) = go[Eo1 + (—u) x By] = q2 Alli — 67h] = 
or, replacing A: 


1 1492 


i 4f72 2,273 2 4u21? ae 
Y [lin + 4u t ] cal) P4022 +4u2 12 


Fy; = 


In this case too it is possible to calculate the force in the proper frame of
the charge q₁ and then transfer the result to Σ using the appropriate Lorentz
transformation. The details are left to the reader.


13.18 Special Relativity and Practical Applications 


The Theory of Special Relativity is not a luxurious exercise of the mind which
“helps” us to understand the world satisfying our metaphysical agonies. It is
a theory for the engineer, a theory which leads us to construct new medical
devices, new measuring instruments and certainly new energy production plants
and (unfortunately) new weapons. In order to just touch on this aspect of Special
Relativity, in this section we discuss an application, which is used directly or
indirectly in the design of counters of charged particles in the laboratory using the
electromagnetic field they produce. The requirements we set for the performance of
this machine are:


• The reaction time (that is the time interval in which the instrument can distinguish
between two charged particles) must be small in order to be possible to measure
fast moving (i.e. relativistic) particles.

• The sensitivity of the instrument (that is the output which it gives for
the maximum velocity and the minimum charge) must be adequate so that it will
be possible to count various kinds of particles.

• The instrument must be capable to “see” small regions in space, because the
radioactive sources used in the laboratory are of small size.


The above conditions are satisfied if we use the magnetic field created by the
moving charged particles to produce electrical pulses. In practice this is achieved if
we place a loop near the orbit of the particle. Indeed the passing of the particle
creates a change in the flux of the magnetic field through the loop (zero, maximum,
zero) which produces an electromotive force $\mathcal{E}_S = -\frac{d\Phi}{dt}$ at the ends of the loop.
This potential can be measured relatively easily.

Based on the above analysis we design the following construction. We consider a
small plane loop of area dS which we place near the radioactive source (considered
to be a point) and in such a way so that the source is in the plane of the loop. We
consider a LCF Σ with origin the source, we assign the plane x−z to be the plane of
the loop and assume the velocity of the charged particles to be along the z-axis. We
also place the center of the loop on the x-axis and at a distance x₀ from the source.

The change of the magnetic flux Φ is due to the normal component of the
magnetic field to the plane of the loop, which according to our arrangement is the
component $B_y$. In order to compute $B_y$ we use the relation $\mathbf{B} = \frac{1}{c^2}\mathbf{u}\times\mathbf{E}$ and taking
into account that u = (0, 0, u) we have:

$$\mathbf{B} = \frac{u}{c^2}\left(E_x\,\mathbf{j} - E_y\,\mathbf{i}\right) \;\Rightarrow\; B_y = \frac{u}{c^2}E_x.$$

Hence:

$$\Phi = B_y(x_0)\,dS = \frac{u}{c^2}E_x(x_0)\,dS$$

where $E_x(x_0)$ is the x-component of the electric field at the position (x₀, 0, 0). The
electromotive force which is created in the loop due to the passing of the charge is:

$$\mathcal{E}_S = -\frac{d\Phi}{dt} = -\frac{u}{c^2}\frac{\partial E_x(x_0)}{\partial t}\,dS.$$

But we have computed (see (13.302)):


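$$E_x(x_0) = \gamma\frac{q}{4\pi\epsilon_0}\frac{x_0}{\left(x_0^2 + \gamma^2u^2t^2\right)^{3/2}}$$

so that:

$$\frac{\partial E_x(x_0)}{\partial t} = -\gamma^3\frac{3q}{4\pi\epsilon_0}\frac{u^2x_0t}{\left(x_0^2 + \gamma^2u^2t^2\right)^{5/2}}.$$

Therefore the electromotive force per unit of loop surface is:

$$\Delta\mathcal{E}_S = \gamma^3\frac{3q\beta^2u}{4\pi\epsilon_0}\frac{x_0t}{\left(x_0^2 + \gamma^2u^2t^2\right)^{5/2}}.$$

We compute that $\Delta\mathcal{E}_S$ has an extremum at the moments:

$$t_0 = \pm\frac{|x_0|}{2\gamma u}.$$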


Without restricting generality we consider x₀ < 0. Then at the moment
$t_{0,1} = -\frac{|x_0|}{2\gamma u}$ appears the maximum electromotive force and at the moment $t_{0,2} = \frac{|x_0|}{2\gamma u}$
the minimum. (If x₀ > 0 the role of these time moments is interchanged). These
values are symmetric about the value t = 0. Therefore at the ends of the loop we
have the voltage of Fig. 13.8 where we have assumed that the voltage pulse has the
form of a Gaussian.
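The position of the extrema can be verified numerically. The sketch below assumes that the electromotive force per unit loop area has the form γ³·3qβ²u·x₀t/(4πε0(x₀² + γ²u²t²)^{5/2}); the function names and the units c = ε0 = 1 are our own choices for illustration.

```python
import math

def emf_per_area(t, x0, u, q=1.0, c=1.0, eps0=1.0):
    """Electromotive force per unit loop area for a charge moving along z with speed u."""
    beta = u / c
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    num = gamma ** 3 * 3.0 * q * beta ** 2 * u * x0 * t
    den = 4.0 * math.pi * eps0 * (x0 ** 2 + (gamma * u * t) ** 2) ** 2.5
    return num / den

def extremum_times(x0, u, c=1.0):
    """Moments t0 = -|x0|/(2 gamma u) and +|x0|/(2 gamma u) of the extrema."""
    gamma = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    t0 = abs(x0) / (2.0 * gamma * u)
    return -t0, t0

x0, u = -1.0, 0.8
t_max, t_min = extremum_times(x0, u)   # for x0 < 0 the maximum comes first
peak = emf_per_area(t_max, x0, u)
dip = emf_per_area(t_min, x0, u)
```

For x₀ < 0 the pulse is positive before t = 0 and negative after it, with `peak == -dip`, in agreement with the antisymmetric shape of the voltage pulse described above.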

Having discussed the basic structure and operation of the instrument (counter)
we continue with its precision. A Gaussian pulse is characterized by two parameters:
The time interval ρ between the maximum and the minimum of the pulse and the
time interval d between two successive pulses (see Fig. 13.9).

Observation has shown that an instrument can distinguish two successive pulses
if d > ρ (see Fig. 13.9a), whereas in the case (d < ρ) the instrument encounters
the pulses as one (see Fig. 13.9b). In the counter under consideration the maximum




Fig. 13.8 The form of the voltage pulse 




Fig. 13.9 Precision of an instrument 


occurs the moment $t_{0,1}$ and the minimum the moment $t_{0,2}$. If we consider that the
particle is radiated the moment $t = 0 = \frac{1}{2}(t_{0,2} + t_{0,1})$ then we have that $\rho = |t_{0,1}| = \frac{|x_0|}{2\gamma u}$.
Using this result and the fact that the electromagnetic field covers
the distance x₀ with finite speed c, we define the precision of the instrument as
follows:

$$\rho = \frac{|x_0|}{2\gamma u}.$$


We note that the precision depends: 


a. On the speed of the radiated particles (as the speed increases the precision is
reduced, which is reasonable and expected)

b. On the distance of the loop from the source (as the loop moves away from
the source the precision increases, assuming that the signal of the source remains
detectable and is not influenced by other interferences).


The above analysis must be given to an industrial physicist who will explain it to 
the designing engineers and together will start the designing of the instrument. This 
activity involves the construction design, the development of the construction plans, 
the construction of the prototype, the evaluation of the prototype with reference to 
prototype sources or other similar reference instruments, the determination of its 
precision etc. When this procedure has been completed the project is passed on 
to the team of designing the appearance of the instrument and after cost analysis, 
market research it is possible (not certain, economics of a project is another story!) 
that the decision will be positive and the instrument will appear in the market.

Physics of the “introspection” is not possible today. We all must be actively 
involved in the process of economy and the development of society. Not for the 
society of things and that of virtual reality which prevails today but the society of 
people. However difficult and disappointing such an action may be for a traditional 
young physicist it is a necessity which has to be faced. 




13.19 The Systems of Units SI and Gauss 
in Electromagnetism 


The main systems of units which are used in the (non-industrial) applications of 
electromagnetism are the SI and the Gauss system. The use of two different systems 
differentiates the constants in Maxwell equations causing confusion as to which 
form corresponds to which system of units and how an equation given in one system 
can be taken over to the other. In this book we have used the SI system only, 
therefore this confusion is not possible. However it is possible that one will wish 
for some reason to write an equation in the Gauss system of units. In the present 
section we give simple rules how this can be done. 

There are two approaches. One, which is the most reliable, is to carry out the
dimensional analysis of an equation and then apply the necessary “factors” as for
example we do with the speed of light when we set c = 1 and then we add the
required c’s at the end, so that the dimensions of all factors match properly. The
second, and easier, method is to use the basic equations of electromagnetism and
find the correspondence in the two systems between the fundamental quantities.
Because this correspondence is independent of the equation used to derive it, it
must be applicable to all equations therefore one can use it to transfer any equation
from one system to the other by transferring term by term. We note that the
correspondence between the electric and the magnetic quantities is done by the
fundamental relation $\epsilon_0\mu_0 = \frac{1}{c^2}$.

We shall work with the second method and start with the correspondence of the
quantity ε₀. For this we consider the Coulomb Law, which in the two systems of
units has the form:

SI System: $\mathbf{F} = \frac{1}{4\pi\epsilon_0}\frac{Q^2}{r^2}\hat{\mathbf{r}}$
Gauss System: $\mathbf{F} = \frac{Q^2}{r^2}\hat{\mathbf{r}}$

It follows that in order to write an expression involving ε₀ from SI to the Gauss
system we must set:

$$\epsilon_0 \leftrightarrow \frac{1}{4\pi}.$$

The transformation of the electric field we find from the equation F = QE. This
relation is identical in both systems of units therefore the correspondence is:

$$\mathbf{E} \leftrightarrow \mathbf{E}.$$


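The correspondence of the magnetic field H and the electric induction D we find
from Ampère’s Law. We have:

SI system: $\nabla\times\mathbf{H} = \mathbf{j} + \frac{\partial\mathbf{D}}{\partial t}$
Gauss system: $\nabla\times\mathbf{H} = \frac{4\pi}{c}\mathbf{j} + \frac{1}{c}\frac{\partial\mathbf{D}}{\partial t}$

therefore:

$$\frac{4\pi}{c}\mathbf{H} \leftrightarrow \mathbf{H}$$
$$4\pi\mathbf{D} \leftrightarrow \mathbf{D}.$$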

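The correspondence between the other basic physical quantities we compute
from the constitutive equations:

SI system: $\mathbf{D} = \epsilon_0\mathbf{E} + \mathbf{P}$; $\mathbf{B} = \mu_0\mathbf{H} + \mu_0\mathbf{M}$
Gauss system: $\mathbf{D} = \mathbf{E} + 4\pi\mathbf{P}$; $\mathbf{B} = \mathbf{H} + 4\pi\mathbf{M}$.

The first gives:

$$\mathbf{P} \leftrightarrow \mathbf{P}$$

and the second:

$$c\mathbf{B} \leftrightarrow \mathbf{B}$$

where we have used the relation $\epsilon_0\mu_0 = \frac{1}{c^2}$.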
The correspondence for the vector potential A we compute from its definition
$\mathbf{B} = \nabla\times\mathbf{A}$, from which follows:

$$c\mathbf{A} \leftrightarrow \mathbf{A}.$$

Similarly for the scalar potential we find ($\mathbf{E} = -\nabla\phi$):

$$\phi \leftrightarrow \phi.$$

For easy reference we collect these results in Table 13.1.


Example 13.19.1 In the SI system the magnetic moment of a current j at the point r
is defined as follows $\mathbf{m} = \frac{1}{2}\int\mathbf{r}\times\mathbf{j}\,dV$. Write the corresponding equation in the Gauss
system of units given that the energy W of the current j in an external field B is
defined by the relation W = m · B.


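Table 13.1 Table for the transformation of equations between the SI and the Gauss system of units

Quantity   SI                          Gauss    SI equation                                                                  Gauss equation
ε₀         $4\pi\epsilon_0$            1        $\mathbf{F} = \frac{1}{4\pi\epsilon_0}\frac{Q^2}{r^2}\hat{\mathbf{r}}$       $\mathbf{F} = \frac{Q^2}{r^2}\hat{\mathbf{r}}$
E          E                           E        F = QE                                                                       F = QE
H          $\frac{4\pi}{c}\mathbf{H}$  H        $\nabla\times\mathbf{H} = \mathbf{j} + \frac{\partial\mathbf{D}}{\partial t}$   $\nabla\times\mathbf{H} = \frac{4\pi}{c}\mathbf{j} + \frac{1}{c}\frac{\partial\mathbf{D}}{\partial t}$
D          $4\pi\mathbf{D}$            D        $\nabla\times\mathbf{H} = \mathbf{j} + \frac{\partial\mathbf{D}}{\partial t}$   $\nabla\times\mathbf{H} = \frac{4\pi}{c}\mathbf{j} + \frac{1}{c}\frac{\partial\mathbf{D}}{\partial t}$
P          P                           P        $\mathbf{D} = \epsilon_0\mathbf{E} + \mathbf{P}$                             $\mathbf{D} = \mathbf{E} + 4\pi\mathbf{P}$
B          cB                          B        $\mathbf{B} = \mu_0\mathbf{H} + \mu_0\mathbf{M}$                             $\mathbf{B} = \mathbf{H} + 4\pi\mathbf{M}$
M          M/c                         M        $\mathbf{B} = \mu_0\mathbf{H} + \mu_0\mathbf{M}$                             $\mathbf{B} = \mathbf{H} + 4\pi\mathbf{M}$
A          cA                          A        $\mathbf{B} = \nabla\times\mathbf{A}$                                        $\mathbf{B} = \nabla\times\mathbf{A}$
φ          φ                           φ        $\mathbf{E} = -\nabla\phi$                                                   $\mathbf{E} = -\nabla\phi$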
Solution

Because the energy has the same units in both systems of units and because the
correspondence of B is cB ↔ B it follows that the correspondence for m is:

$$\frac{1}{c}\mathbf{m} \leftrightarrow \mathbf{m}.$$

Hence in the Gauss system the magnetic moment is defined with the relation
$\mathbf{m} = \frac{1}{2c}\int\mathbf{r}\times\mathbf{j}\,dV$.


Example 13.19.2 In the SI system of units the inductance of a coil of length
l, small radius r and n turns per unit of length is given by the formula:

$$L = \mu_0 n^2 l\pi r^2.$$

Write this formula in the Gauss system of units given that the inductance L is
transformed as the quantity B.
Solution

In the SI system we have:

$$L = \frac{1}{\epsilon_0c^2}n^2l\pi r^2 = \frac{4\pi}{4\pi\epsilon_0c^2}n^2l\pi r^2.$$

The correspondence of L is cL ↔ L therefore in the Gauss system this relation
becomes:

$$cL \rightarrow L = \frac{4\pi}{c}n^2l\pi r^2.$$


Chapter 14
Relativistic Angular Momentum


14.1 Introduction 


In this chapter we continue our program of generalization of Newtonian physical
quantities to Special Relativity by considering the physical quantity angular
momentum tensor. Since in Newtonian Physics this quantity is described by
an antisymmetric second order tensor it is necessary that we introduce new
mathematical concepts and tools, the main one being the concept of a bivector. We
shall also make use of the basics of the antisymmetric tensor analysis discussed in
Sect. 13.10.1. The reader should consult this section before attempting to read the
present chapter.


14.2 Mathematical Preliminaries 


A bivector is any second order antisymmetric tensor $X_{ab} = -X_{ba}$. A bivector is
called simple if it can be written in the form $X_{ab} = A_{[a}B_{b]}$ where $A^a$, $B^a$ are
vectors. The dual bivector $X^{*ab}$ of a bivector $X^{ab}$ is defined as follows:

$$X^{*ab} = \frac{1}{2}\eta^{abcd}X_{cd} \;\Leftrightarrow\; X^{ab} = -\frac{1}{2}\eta^{abcd}X^{*}_{cd}. \qquad (14.1)$$


Bivectors, being tensors, can be 1+3 decomposed wrt any timelike vector. 


© Springer Nature Switzerland AG 2019 527 
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-27347-7_14 


14.2.1 1+3 Decomposition of a Bivector $X_{ab}$


We consider a timelike vector field $p^a$ ($p^ap_a = -p^2$, $p > 0$) and the associated
projection operator:

$$h_{ab} = \eta_{ab} + \frac{1}{p^2}p_ap_b.$$

The 1+3 decomposition¹ of a general second order tensor $T_{ab}$ in Minkowski space
has been computed in (12.11) as follows:

$$T_{ab} = \frac{1}{p^4}(T_{cd}p^cp^d)p_ap_b - \frac{1}{p^2}(h_a^{\;c}T_{cd}p^d)p_b - \frac{1}{p^2}(h_b^{\;d}T_{cd}p^c)p_a + h_a^{\;c}h_b^{\;d}T_{cd}$$

or in matrix form:

$$T_{ab} = \begin{pmatrix} \frac{1}{p^4}(T_{cd}p^cp^d) & -\frac{1}{p^2}(h_b^{\;d}T_{cd}p^c) \\ -\frac{1}{p^2}(h_a^{\;c}T_{cd}p^d) & h_a^{\;c}h_b^{\;d}T_{cd} \end{pmatrix}. \qquad (14.2)$$

This is a mathematical identity which expresses covariantly the tensor in terms
of one scalar, two vectors and one second order tensor of order (2, 2) in the proper
space of $p^a$. In the special case that the tensor $T_{ab} = X_{ab}$ where $X_{ab}$ is a bivector,
formula (14.2) simplifies. Indeed the term:

$$X_{ab}p^ap^b = 0. \qquad (14.3)$$


Next we define the vector:

$$E_a = \frac{1}{p^2}h_a^{\;c}X_{cd}p^d = \frac{1}{p^2}X_{ab}p^b. \qquad (14.4)$$

This vector we call the electric part of the bivector $X_{ab}$. Then the terms in (14.2),
which contain once the projection tensor give:

$$-E_ap_b + E_bp_a. \qquad (14.5)$$

Concerning the last term which contains twice the projection tensor, we define
another vector $H^a$ as follows:

$$h_a^{\;c}h_b^{\;d}X_{cd} = -\eta_{abcd}p^cH^d \qquad (14.6)$$


¹ This result is general and holds for the 1 + (n − 1) decomposition of a second order tensor.


and have the final 1+3 decomposition of $X_{ab}$ along $p^a$:

$$X_{ab} = -E_ap_b + E_bp_a - \eta_{abcd}p^cH^d. \qquad (14.7)$$

The vector $H^a$ we call the magnetic part of the bivector $X_{ab}$. We emphasize
that (14.7) is a mathematical identity, therefore all information of $X_{ab}$ is contained²
in the pair of vector fields $E^a$, $H^a$.

In order to compute the vector $H_a$ in terms of the bivector $X_{ab}$ we contract (14.6)
with $\eta_{abcd}$ and get:

$$\eta_{abcd}p^bX^{cd} = \eta_{abcd}p^b\left[-E^cp^d + E^dp^c - \eta^{cdrs}p_rH_s\right]$$
$$= -\eta_{abcd}\eta^{cdrs}p^bp_rH_s$$
$$= 2\left[\delta_a^r\delta_b^s - \delta_a^s\delta_b^r\right]p^bp_rH_s$$
$$= -2(p^cp_c)H_a = 2p^2H_a$$

from which follows:

$$H_a = \frac{1}{2p^2}\eta_{abcd}p^bX^{cd}, \qquad p > 0. \qquad (14.8)$$

We note that both vectors $E^a$, $H^a$ are spacelike:

$$E^ap_a = H^ap_a = 0. \qquad (14.9)$$

Let $\Sigma_p$ be the proper frame of $p^a$ and suppose that in this frame $E^a = (E^1, E^2, E^3)$
and $H^a = (H^1, H^2, H^3)$. Then in $\Sigma_p$ the components of $X_{ab}$ are:

$$[X_{ab}] = \begin{pmatrix} 0 & -pE^1 & -pE^2 & -pE^3 \\ pE^1 & 0 & pH^3 & -pH^2 \\ pE^2 & -pH^3 & 0 & pH^1 \\ pE^3 & pH^2 & -pH^1 & 0 \end{pmatrix}_{\Sigma_p}. \qquad (14.10)$$

Finally we compute the invariants of the bivector $X_{ab}$ in terms of the vectors
$E^a$, $H^a$. We have:

$$2X = -X_{ab}X^{ab} = -\left[-E_ap_b + E_bp_a - \eta_{abcd}p^cH^d\right]\left[-E^ap^b + E^bp^a - \eta^{abrs}p_rH_s\right]$$

$$= -2(E^aE_a)(-p^2) - \eta_{abcd}\eta^{abrs}p^cH^dp_rH_s$$


² In an n-dimensional space the vectors $E^a$, $H^a$ have n − 1 components each hence the bivector $X_{ab}$
has 2(n − 1) independent components. Compare with the electromagnetic field tensor $F_{ab}$ which
is also a bivector.


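$$= 2p^2E^2 + 2\left(\delta_c^r\delta_d^s - \delta_c^s\delta_d^r\right)p^cp_rH^dH_s$$
$$= 2p^2E^2 - 2p^2H^2 = 2p^2(E^2 - H^2) \qquad (14.11)$$

and:

$$Y = \eta_{abcd}X^{ab}X^{cd}$$
$$= \eta_{abcd}\left[-E^ap^b + E^bp^a - \eta^{abrs}p_rH_s\right]\left[-E^cp^d + E^dp^c - \eta^{cdmn}p_mH_n\right]$$
$$= 2\eta_{abcd}\eta^{cdmn}E^ap^bp_mH_n + 2\eta_{abcd}\eta^{abrs}p_rH_sE^cp^d$$
$$= -4\left(\delta_a^m\delta_b^n - \delta_a^n\delta_b^m\right)E^ap^bp_mH_n - 4\left(\delta_c^r\delta_d^s - \delta_c^s\delta_d^r\right)p_rH_sE^cp^d$$
$$= -8p^2(E^aH_a). \qquad (14.12)$$

Therefore:

$$X = -\frac{1}{2}X_{ab}X^{ab} = p^2(E^2 - H^2) \qquad (14.13)$$

$$Y = -\frac{1}{8}\eta_{abcd}X^{ab}X^{cd} = p^2(E^aH_a). \qquad (14.14)$$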

Exercise 14.2.1 Consider the bivector $F_{ab}$ of the electromagnetic field tensor and
let $p^a = u^a$, where $u^a$ is the four-velocity of the observer. Show that the electric
part and the magnetic part of this bivector defined in (14.4) and (14.8) as well
as the invariants defined in (14.11) and (14.12) respectively, coincide with the
corresponding quantities considered in Sect. 13.1.


14.3 The Derivative of the Bivector $X_{ab}$ along the Vector $p^a$


Let s be an affine parameter along the trajectory of a particle and let $u^a = \frac{dx^a}{ds} = \dot{x}^a$
be the four-velocity of the particle. We find from (14.7):

$$\frac{dX_{ab}}{ds} = \dot{X}_{ab} = -\dot{E}_ap_b - E_a\dot{p}_b + \dot{E}_bp_a + E_b\dot{p}_a - \eta_{abcd}\dot{p}^cH^d - \eta_{abcd}p^c\dot{H}^d$$

where a dot over a symbol indicates derivation wrt s. The tensor $\dot{X}_{ab}$ is also
a bivector, therefore it can be decomposed as above in terms of a new pair of
spacelike vectors ($e^a$, $h^a$). We compute the electric and the magnetic parts assuming
$p^a\dot{p}_a = 0$ i.e. the proper mass is constant. We find using (14.4) and (14.8):

$$e^a = \frac{1}{p^2}\dot{X}^{ab}p_b = h^a_{\;b}\dot{E}^b - \frac{1}{p^2}\eta^{abcd}p_b\dot{p}_cH_d. \qquad (14.15)$$

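Also:

$$h^a = \frac{1}{2p^2}\eta^{abcd}p_b\dot{X}_{cd}$$
$$= \frac{1}{2p^2}\eta^{abcd}p_b\left(-\dot{E}_cp_d - E_c\dot{p}_d + \dot{E}_dp_c + E_d\dot{p}_c - \eta_{cdrs}\dot{p}^rH^s - \eta_{cdrs}p^r\dot{H}^s\right)$$
$$= \frac{1}{2p^2}\eta^{abcd}p_b\left(2E_d\dot{p}_c - \eta_{cdrs}\dot{p}^rH^s - \eta_{cdrs}p^r\dot{H}^s\right)$$
$$= \frac{1}{2p^2}\eta^{abcd}p_b\,2E_d\dot{p}_c - \frac{1}{2p^2}\eta^{abcd}p_b\eta_{cdrs}\dot{p}^rH^s - \frac{1}{2p^2}\eta^{abcd}p_b\eta_{cdrs}p^r\dot{H}^s.$$

The terms:

$$\eta^{abcd}p_b\eta_{cdrs}\dot{p}^rH^s = -2\left(\delta^a_r\delta^b_s - \delta^a_s\delta^b_r\right)p_b\dot{p}^rH^s = 0$$
$$\eta^{abcd}p_b\eta_{cdrs}p^r\dot{H}^s = -2\left(\delta^a_r\delta^b_s - \delta^a_s\delta^b_r\right)p_bp^r\dot{H}^s = -2p^2h^a_{\;b}\dot{H}^b$$

therefore finally:

$$h^a = h^a_{\;b}\dot{H}^b + \frac{1}{p^2}\eta^{abcd}p_b\dot{p}_cE_d. \qquad (14.16)$$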


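We consider now the orthonormal frame³ $\{u^a, N(\rho)^a\}$ where $u^a = p^a/|p|$ and
$N(\rho)^aN(\sigma)_a = \delta_{\rho\sigma}$ where the index ρ counts vectors. In this frame we set:

$$E^a = \sum_{\rho=1}^{3}p_\rho N(\rho)^a, \qquad H^a = \sum_{\rho=1}^{3}(-1)^\rho p_{3+\rho}N(\rho)^a \qquad (14.17)$$

where $p_\rho$, $p_{3+\rho}$ are components. Then the derivative of $E^a$, $H^a$ along $u^a$ is:

$$\dot{E}^a = \sum_{\rho=1}^{3}\left(\dot{p}_\rho N(\rho)^a + p_\rho\dot{N}(\rho)^a\right), \qquad \dot{H}^a = \sum_{\rho=1}^{3}(-1)^\rho\left(\dot{p}_{3+\rho}N(\rho)^a + p_{3+\rho}\dot{N}(\rho)^a\right) \qquad (14.18)$$

that is, they are expressed in terms of the derivatives $\dot{N}(\rho)^a$. Let us assume in this
frame that the vectors $N(\rho)^a$ are propagated along the particle trajectory according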

³ The calculations are general and hold for n dimensions.


to the ‘law’:

$$\dot{N}(\rho)^a = S_\rho^{\;0}u^a + \sum_{\mu=1}^{3}S_\rho^{\;\mu}N(\mu)^a. \qquad (14.19)$$
Then we obtain: 
3 3 
E* = S° | pyN(p)* + pp | Sout + D> SPN(u)* 
p=1 pel 
3 
= >] ppSput + | PSK + pp d_ SK] N(u)4 (14.20) 
p=1 p=l 
3 3 
H? = S0(-1? | p34 p Shu" + | 34054 + prt > SH | N(u)* |. (14.21) 
p=1 p=1 


These relations give us the propagation equation of the vectors $E^a$, $H^a$
associated with the bivector $X_{ab}$ relative to the unit 4-vector $u^a$. The transport law
of the frame $\{u^a, N(\rho)^a\}$ is general and can be established in practice by means of a
“parallel” transport law of the frame $\{u^a, N(\rho)^a\}$, that is by a derivation.⁴ If we
replace the expressions (14.20) and (14.21) into (14.15) and (14.16) we find:


3 S 
e* =| >) Bpd4 + pp >, SH — xin” pay | NC) (14.22) 
p=1 p=1 
3 3 
h® =| SOD? | 3408 + prep > SE | tein?’ py | N(u)* (14.23) 
p=1 p=1 


where $\kappa_1$ is the first (principal) normal of the trajectory defined by the equation:


d a 
| all EY ry (14.24) 


ds 


Kinematically $\kappa_1$ is the length of the four-acceleration and geometrically the inverse
of the radius of curvature of the orbit at the point where it is computed.


⁴ E.g. a Fermi propagated frame.


14.4 The Angular Momentum in Special Relativity 


Having presented the basics of the theory of bivectors we are in a position to
proceed with the generalization of the concept of Newtonian angular momentum in
Special Relativity. The relativistic form of this concept is necessary because angular
momentum is a fundamental quantity of Newtonian Physics and because, as will
be seen, it leads to the important concept of spin, which is a purely relativistic physical
quantity with no Newtonian analogue.


14.4.1 The Angular Momentum in Newtonian Theory 


The Newtonian angular momentum of a particle with linear momentum $p^\mu$ with
reference at a point with position vector $a^\mu$ is the (0, 2) tensor $l_{\mu\nu}$ defined as follows:

$$l_{\mu\nu} = (x_\mu - a_\mu)p_\nu - p_\mu(x_\nu - a_\nu). \qquad (14.25)$$

We note that $l_{\mu\nu}$ is the same for all points with position vector $a_\mu + kp_\mu$ ($k \in R$).
Definition (14.25) can be written as follows:

$$l_\mu(a) = \eta_{\mu\nu\rho}(r^\nu - a^\nu)p^\rho \qquad (14.26)$$

where $l^\rho$ is a 1-form or pseudo vector with components the three components
$l_{12}, l_{13}, l_{23}$ of the antisymmetric tensor $l_{\mu\nu}$ of angular momentum.
In the following we consider the angular momentum wrt the origin (i.e. we take
the point with $a_\mu = 0$). Then relations (14.25) and (14.26) read:

$$l_{\mu\nu} = \eta_{\mu\nu\rho}l^\rho \qquad (14.27)$$
$$l_\mu = \eta_{\mu\nu\rho}r^\nu p^\rho. \qquad (14.28)$$

In 3-vector notation we write $l^\rho = \mathbf{l}$ and it is easy to show that l can be written as
the cross product:

$$\mathbf{l} = \mathbf{r}\times\mathbf{p} \qquad (14.29)$$

where r, p are the position vector and the linear momentum of the particle.
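The correspondence between the antisymmetric tensor of (14.25) and the cross product (14.29) can be made concrete with a few lines of code. The sketch below uses Cartesian components (so that upper and lower indices coincide) and function names of our own choosing.

```python
def angular_momentum_tensor(x, p, a=(0.0, 0.0, 0.0)):
    """l_mu_nu = (x_mu - a_mu) p_nu - p_mu (x_nu - a_nu), relation (14.25)."""
    d = [xi - ai for xi, ai in zip(x, a)]
    return [[d[m] * p[n] - p[m] * d[n] for n in range(3)] for m in range(3)]

def cross(r, p):
    return (r[1] * p[2] - r[2] * p[1],
            r[2] * p[0] - r[0] * p[2],
            r[0] * p[1] - r[1] * p[0])

x, p = (1.0, -2.0, 0.5), (0.3, 0.4, -1.2)
l_tensor = angular_momentum_tensor(x, p)   # wrt the origin
l_vector = cross(x, p)                     # l = r x p

# Moving the reference point along the direction of p leaves the tensor unchanged.
shifted = angular_momentum_tensor(x, p, a=tuple(2.5 * pi for pi in p))
```

The independent components of the tensor reproduce the vector: l₂₃ = l_x, l₃₁ = l_y and l₁₂ = l_z.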
Newton’s Second Law gives:

$$\frac{dl_{\mu\nu}}{dt} = x_\mu f_\nu - f_\mu x_\nu \qquad (14.30)$$

where $f^\mu = \frac{dp^\mu}{dt}$ is the 3-force on the particle. The 3-vector form of this formula is:


$$\frac{d\mathbf{l}}{dt} = \mathbf{r}\times\mathbf{f}. \qquad (14.31)$$

The bivector:

$$M_{\mu\nu} = x_\mu f_\nu - f_\mu x_\nu \qquad (14.32)$$

we call the net moment or the net torque of the force acting on the particle. It can
be represented by a 1-form $M^\rho$ according to the formula:

$$M_{\mu\nu} = \eta_{\mu\nu\rho}M^\rho. \qquad (14.33)$$

Then the equation of motion of the angular momentum reads:

$$\frac{dl_{\mu\nu}}{dt} = M_{\mu\nu} \qquad (14.34)$$

and in terms of the corresponding 1-forms:

$$\eta_{\mu\nu\rho}\left(M^\rho - \frac{dl^\rho}{dt}\right) = 0.$$

Contracting with $\eta^{\mu\nu\sigma}$ we find:

$$M^\rho = \frac{dl^\rho}{dt}. \qquad (14.35)$$
The 3-vector notation of the above is as follows. If f is the 3-force on the particle,
Newton’s Second Law gives for the moment of force, or the pseudovector (=1-form)
of torque about the point with position vector a:

$$\mathbf{N}(\mathbf{a}) = (\mathbf{r}-\mathbf{a})\times\mathbf{f} = \mathbf{r}\times\frac{d\mathbf{p}}{dt} - (\mathbf{a}\times\mathbf{f}) = \frac{d}{dt}(\mathbf{r}\times\mathbf{p}) - \dot{\mathbf{L}}(\mathbf{a}) = \frac{d\mathbf{L}}{dt} - \dot{\mathbf{L}}(\mathbf{a})$$

where L = r × p is the angular momentum of the particle wrt the origin, which
we call the net angular momentum. When we take the point to be the origin, then
L(a) = 0 and the net angular momentum L is related to the net torque N as follows:

$$\mathbf{N} = \frac{d\mathbf{L}}{dt}. \qquad (14.36)$$



From this relation it follows that if the net torque vanishes, the net angular
momentum remains constant. This result is known as the conservation of angular 
momentum. 


14.4.2 The Angular Momentum of a Particle in Special 
Relativity 


Before we proceed, we remark that the generalization of the angular momentum
to Special Relativity presents a new situation. Up to now all the physical
quantities we have considered and generalized were vectors (e.g. the velocity,
the acceleration, the momentum etc.) whereas now we generalize a bivector or,
equivalently, a pseudovector. This means that we no longer have the rather
easy physical intuition of the vector quantities and we must rely more on
mathematical manipulations and 'similarities' than on 'plausible' physical
grounds. Because of this we must be prepared to meet 'strange' situations, in the
sense that we may end up with relativistic physical quantities with no Newtonian
analogue. However this is not new. Indeed, we recall that the four-velocity in
the proper frame of the particle has one component only, the quantity c, a purely
relativistic quantity with no Newtonian analogue. A more drastic situation is the
case of the four-acceleration, which in the proper frame is defined solely by
the proper acceleration $a^+$, again without Newtonian analogue. In both these cases,
we postulated the physical nature of the new relativistic quantities. Therefore, it is
reasonable to expect that in the proper frame of the particle the angular momentum
may reduce to an antisymmetric (0,2) tensor with no Newtonian
analogue, whose physical significance will have to be postulated.

Let us consider a particle with position four-vector $x^a$ and four-momentum $p^a$
and a spacetime point with position four-vector $A^a$. We define the relativistic
angular momentum of the particle wrt the point $A^a$ to be the bivector:


$$ L_{ab}(A) = (x_a - A_a) p_b - p_a (x_b - A_b) = \eta_{abcd}(x^c - A^c) p^d. \qquad (14.37) $$

The bivector:

$$ L_{ab} = x_a p_b - p_a x_b = \eta_{abcd} x^c p^d \qquad (14.38) $$

we call the net angular momentum of the particle. We have:

$$ L_{ab}(A) = L_{ab} - (A_a p_b - p_a A_b) \qquad (14.39) $$

that is, $L_{ab}(A)$ equals $L_{ab}$ minus the constant term $A_a p_b - p_a A_b$. This formal
definition takes us to a situation analogous to that of Newtonian Physics, where the
angular momentum depends on the point at which it is defined. Note that if $A^a$
is replaced with $A^a + k p^a$ then $L_{ab}(A)$ does not change. This means that if the
reference point 'moves' along the worldline of the instantaneous inertial observer
of the particle, the angular momentum $L_{ab}(A)$ does not change. In the following we
consider the angular momentum wrt the origin (i.e. we take $A^a = 0$) and we discuss
the net relativistic angular momentum.

We compute the electric and magnetic 4-vectors associated with the bivector $L_{ab}$
in its 1+3 decomposition wrt the four-momentum $p^a$. For the electric part we have
from (14.4):


$$ E_a = \frac{1}{p^2} h_a{}^c L_{cd} p^d = \frac{1}{p^2} h_a{}^c (x_c p_d - p_c x_d) p^d = \frac{1}{p^2}(-p^2)\, h_a{}^c x_c = -h_a{}^c x_c \qquad (14.40) $$


that is, $E^a$ is, up to sign, the spatial part of the position vector $x^a$. Concerning the magnetic part
we find using (14.8):


$$ H_a = \frac{1}{2p^2}\eta_{abcd}\, p^b L^{cd} = \frac{1}{2p^2}\eta_{abcd}\, p^b (x^c p^d - p^c x^d) = 0. \qquad (14.41) $$

We conclude that the angular momentum tensor with reference to the
four-momentum $p^a$ has "electric" part only and this equals $-h_a{}^c x_c$. It is remarkable
that the mass does not enter into the vector fields defined by the four-momentum.
Concerning the invariants of $L_{ab}$ we compute using the general formulae (14.13)
and (14.14):


$$ X = -p^2 E^2, \qquad Y = 0. \qquad (14.42) $$
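
The statement that the angular momentum bivector has an electric part only can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's method: it assumes the signature (-,+,+,+), units with c = 1, the convention $p^2 = -p^a p_a$, and hypothetical values for the mass, the velocity and the position four-vector. The overall sign convention of the Levi-Civita symbol is also an assumption; it does not affect the check, because the magnetic part is expected to vanish.

```python
import itertools
import math

METRIC = [-1.0, 1.0, 1.0, 1.0]   # diagonal of eta_ab, signature (-,+,+,+)


def lower(v):
    """Lower the index of a four-vector with the diagonal metric."""
    return [METRIC[a] * v[a] for a in range(4)]


def dot(u, v):
    """Lorentz inner product u^a v_a of two four-vectors with upper indices."""
    return sum(METRIC[a] * u[a] * v[a] for a in range(4))


def permutation_sign(indices):
    """Sign of a permutation of (0, 1, 2, 3), counted by inversions."""
    sign = 1
    for i in range(4):
        for j in range(i + 1, 4):
            if indices[i] > indices[j]:
                sign = -sign
    return sign


c = 1.0                              # units with c = 1 (assumption)
m = 2.0                              # proper mass (hypothetical)
v = [0.3, 0.4, 0.1]                  # 3-velocity (hypothetical)
gamma = 1.0 / math.sqrt(1.0 - sum(vi * vi for vi in v) / c**2)
p_up = [m * gamma * c] + [m * gamma * vi for vi in v]
x_up = [1.5, 0.2, -0.7, 1.1]         # position four-vector (hypothetical)

p_low, x_low = lower(p_up), lower(x_up)
p2 = -dot(p_up, p_up)                # p^2 = -p^a p_a = (m c)^2

# L_ab = x_a p_b - p_a x_b
L = [[x_low[a] * p_low[b] - p_low[a] * x_low[b] for b in range(4)]
     for a in range(4)]

# Electric part E_a = (1/p^2) L_ab p^b; it is already orthogonal to p^a.
E = [sum(L[a][b] * p_up[b] for b in range(4)) / p2 for a in range(4)]

# Expected value -h_a^c x_c with h_a^c = delta_a^c + p_a p^c / p^2
E_expected = [-(x_low[a] + p_low[a] * dot(p_up, x_up) / p2) for a in range(4)]

# Magnetic part H^a = (1/2p^2) eps^{abcd} p_b L_cd
H = [0.0] * 4
for perm in itertools.permutations(range(4)):
    a, b, cc, d = perm
    H[a] += permutation_sign(perm) * p_low[b] * L[cc][d] / (2.0 * p2)

print(E, E_expected, H)
```

The vanishing of H is purely algebraic: each term contains the product of two momenta contracted with the totally antisymmetric symbol.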


Next we consider the propagation of the net angular momentum along the world
line of the particle, that is the derivative $\frac{dL_{ab}}{d\tau} = \dot{L}_{ab}$. This is also a bivector whose
electric and magnetic parts are given by (14.15) and (14.16) respectively. Assuming
that the proper mass of the particle is constant (i.e. the particle does not radiate) we
compute:


$$ e^a(L) = h^a{}_b \dot{E}^b = h^a{}_b \frac{d}{d\tau}\left(-h^{bc} x_c\right) = -h^a{}_b\left(\dot{h}^{bc} x_c + h^{bc} u_c\right) $$
$$ = -h^a{}_b \frac{d}{d\tau}\left(\eta^{bc} + \frac{1}{c^2} u^b u^c\right) x_c = -\frac{1}{c^2} h^a{}_b \left(\dot{u}^b u^c + u^b \dot{u}^c\right) x_c $$
$$ = -\frac{1}{c^2}\dot{u}^a (u^c x_c) = -\frac{1}{mc^2}(u^c x_c) F^a \qquad (14.43) $$

$$ h^a(L) = \frac{1}{p^2}\eta^{abcd} p_b \dot{p}_c E_d = -\frac{1}{p^2}\eta^{abcd} p_b F_c h_d{}^e x_e = -\frac{1}{p^2}\eta^{abcd} p_b F_c x_d = -\frac{1}{mc^2}\eta^{abcd} u_b F_c x_d \qquad (14.44) $$


where $u^a$ is the four-velocity of the particle and we have applied Newton's
generalized Second Law to write $F^a = \frac{dp^a}{d\tau} = \dot{p}^a$. We note that both $e^a$, $h^a$ are
spacelike four-vectors:

$$ e^a = h^a{}_b e^b, \qquad h^a = h^a{}_b h^b. \qquad (14.45) $$
We define the (net relativistic) torque tensor of the four-force $F^a$ by the formula:

$$ M^{ab} = x^a F^b - F^a x^b. $$


This is also a bivector, hence it has an 'electric' and a 'magnetic' part. We compute:


$$ E^a(M) = \frac{1}{p^2} M^{ab} p_b = -\frac{1}{p^2}(x^b p_b) F^a = -\frac{1}{mc^2}(x^b u_b) F^a $$

$$ H^a(M) = \frac{1}{2p^2}\eta^{abcd} p_b (x_c F_d - F_c x_d) = \frac{1}{2mc^2}\eta^{abcd} u_b (x_c F_d - x_d F_c) = \frac{1}{mc^2}\eta^{abcd} u_b x_c F_d. $$


We note that $E^a(M) = e^a(L)$ and $H^a(M) = h^a(L)$, therefore⁵:


$$ \dot{L}^{ab} = M^{ab}. \qquad (14.47) $$


This is the equation of propagation (i.e. equation of motion) of the (relativistic)
angular momentum $L_{ab}$ of the particle.

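We compute the invariant $u^a x_a$ in the proper frame of the particle. In that frame
$x^a = \begin{pmatrix} c\tau \\ \mathbf{0} \end{pmatrix}_{\Sigma^+}$, $u^a = \begin{pmatrix} c \\ \mathbf{0} \end{pmatrix}_{\Sigma^+}$, hence $u^a x_a = -c^2\tau$. Therefore:


$$ e^a(L) = \frac{\tau}{m} F^a. \qquad (14.48) $$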


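⁵ This equation is computed directly as follows:


$$ \frac{dL^{ab}}{d\tau} = \frac{dx^a}{d\tau} p^b + x^a \frac{dp^b}{d\tau} - \frac{dp^a}{d\tau} x^b - p^a \frac{dx^b}{d\tau} = x^a F^b - F^a x^b = M^{ab}. \qquad (14.46) $$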




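The components of the angular momentum in the frame of a LCF $\Sigma$ in which
$x^a = \begin{pmatrix} ct \\ \mathbf{r} \end{pmatrix}_\Sigma$, $p^a = \begin{pmatrix} m\gamma c \\ m\gamma \mathbf{v} \end{pmatrix}_\Sigma$ are computed as follows⁶:


$$ L^{ab} = x^a p^b - p^a x^b = \begin{pmatrix} ct \\ r^\mu \end{pmatrix}_\Sigma \otimes \begin{pmatrix} m\gamma c \\ m\gamma v^\nu \end{pmatrix}^t_\Sigma - \begin{pmatrix} m\gamma c \\ m\gamma v^\mu \end{pmatrix}_\Sigma \otimes \begin{pmatrix} ct \\ r^\nu \end{pmatrix}^t_\Sigma $$

$$ = \begin{pmatrix} m\gamma c^2 t & m\gamma c t v^\nu \\ m\gamma c r^\mu & m\gamma r^\mu v^\nu \end{pmatrix} - \begin{pmatrix} m\gamma c^2 t & m\gamma c r^\nu \\ m\gamma c t v^\mu & m\gamma v^\mu r^\nu \end{pmatrix} $$

$$ = \begin{pmatrix} 0 & m\gamma c (t v^\nu - r^\nu) \\ -m\gamma c (t v^\mu - r^\mu) & m\gamma (r^\mu v^\nu - v^\mu r^\nu) \end{pmatrix} $$


i.e. the components of $L^{ab}$ in the frame $\Sigma$ are:


$$ L^{00} = 0 \qquad (14.49) $$

$$ L^{0\mu} = m\gamma c (t v^\mu - r^\mu) \qquad (14.50) $$

$$ L^{\mu 0} = -m\gamma c (t v^\mu - r^\mu) = -L^{0\mu} \qquad (14.51) $$

$$ L^{\mu\nu} = m\gamma (r^\mu v^\nu - v^\mu r^\nu) = m\gamma\, \ell^{\mu\nu} \qquad (14.52) $$

where $\ell^{\mu\nu} = (r^\mu v^\nu - v^\mu r^\nu)$ is identified with the net Newtonian angular
momentum.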

We note that for a particle at rest $\ell^{\mu\nu} = 0$, but $L^{0\mu} = -mc\, r^\mu \neq 0$, i.e. the
four-angular momentum does not vanish. However in the proper frame of the particle
(and only there, because only there $r^\mu = 0$) both $L_{ab} = 0$ and $M^{ab} = 0$. We
conclude that it is possible to define the proper frame by the condition:


$$ L_{ab}\, p^b = 0. \qquad (14.53) $$


This observation is important and will be used subsequently. 
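
The component formulae (14.49) to (14.52) follow from the antisymmetrized outer product of $x^a$ and $p^a$, which is simple to evaluate numerically. The following is a minimal sketch in plain Python, with a hypothetical helper `four_angular_momentum` and hypothetical values of the mass, time, position and velocity. It compares the direct computation with the closed forms and also shows that for a particle at rest in $\Sigma$ the space-space block vanishes while the time-space block does not.

```python
import math

C = 299792458.0   # speed of light in m/s


def four_angular_momentum(m, t, r, v):
    """Return (L, gamma) with L^{ab} = x^a p^b - p^a x^b in the frame Sigma."""
    gamma = 1.0 / math.sqrt(1.0 - sum(vi * vi for vi in v) / C**2)
    x = [C * t] + list(r)
    p = [m * gamma * C] + [m * gamma * vi for vi in v]
    L = [[x[a] * p[b] - p[a] * x[b] for b in range(4)] for a in range(4)]
    return L, gamma


m = 1.0e-3                       # mass in kg (hypothetical)
t = 2.0e-9                       # coordinate time in s (hypothetical)
r = [0.3, -0.1, 0.2]             # position in m (hypothetical)
v = [0.6 * C, 0.0, 0.2 * C]      # velocity in m/s (hypothetical)

L, gamma = four_angular_momentum(m, t, r, v)

# Closed forms (14.50) and (14.52)
L0_expected = [m * gamma * C * (t * v[i] - r[i]) for i in range(3)]
Lij_expected = [[m * gamma * (r[i] * v[j] - v[i] * r[j]) for j in range(3)]
                for i in range(3)]

# Particle at rest: the space-space block vanishes, the time-space block does not.
L_rest, _ = four_angular_momentum(m, t, r, [0.0, 0.0, 0.0])

print(L[0][1:], L0_expected)
```

Writing the bivector as a 4x4 nested list mirrors the block matrix used in the text: row 0 holds $L^{0\mu}$ and the lower-right 3x3 block holds $L^{\mu\nu}$.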


⁶ t stands for transpose. Note the method we use to compute the components by the tensor product
of the corresponding matrices.


14.5 The Intrinsic Angular Momentum: The Spin Vector 


The angular momentum we considered in Sect. 14.4.2 is due to the motion of the
particle, that is, it is of a kinematic nature. For this reason we call it the orbital
angular momentum, mainly because it was originally used in the early model of
the atom to study the motion of the electrons rotating around the nucleus. Its
main characteristic is that it vanishes in the proper frame of the particle. Physical
observations have shown that this type of angular momentum is not sufficient to
cover the physical phenomena and one has to consider an additional type of angular
momentum, this time of a dynamic nature. The sum of the orbital and the dynamic
angular momentum makes up the total angular momentum of the particle, which is the
quantity one should consider in the study of motion of a particle in a magnetic field.

It is to be noted that the definition of intrinsic angular momentum (and
consequently the spin) given below does not apply to the photon and the neutrino,
because neither has a proper frame. However it is possible to define spin for
these particles by a limiting process ($m \to 0$). It turns out that for these particles
the spin is either parallel or antiparallel to the 3-velocity in all frames.⁷


14.5.1 The Magnetic Dipole 


To obtain a feeling of the Physics of the ‘dynamic’ angular momentum, we discuss 
briefly some well known experiments of classical electromagnetism. It is well 
known that when an electric current i moves in a magnetic field B, it suffers a force
$\mathbf{F} = i \int d\mathbf{l} \times \mathbf{B}$ where $d\mathbf{l}$ is a differential element of length along the conductor
(more generally, along the path of the current). Consider a rectangular loop ABCD
of wire of length AB = CD = b and width BC = DA = a which is placed
in a uniform magnetic field B, so that the plane of the loop is always normal to 
the direction of the magnetic field (see Fig. 14.1a). The current is provided into the 
loop by a pair of wires which are twisted tightly together so that there will be no 
net magnetic force on the twisted pair, because the currents in the two wires are 
in opposite directions. Thus the lead wires may be ignored. The loop is suspended 
from a long inextensible string at its center of mass, so that it is free to turn, at least 
through a small angle. 

The net force on the loop is the resultant of the forces on the four sides of the loop.
Let us determine the force on side CD (see Fig. 14.1b). On the side CD the vector
$d\mathbf{l}$ points in the direction of the current and has magnitude b. The angle between
CD and B is $90^\circ - \theta$, hence the magnitude of the force on this side is:


$$ F_{CD} = ibB\cos\theta $$


⁷ See E. P. Wigner, Rev. Mod. Phys. 29 (1957), 255.




Fig. 14.1 Rectangular coil carrying current in a uniform magnetic field


and its direction is out of the plane of Fig. 14.1b. Working in a similar way, we
show that the force on the opposite side AB has the same magnitude $F_{AB} = F_{CD}$
and points in the opposite direction to $\mathbf{F}_{CD}$. Thus $\mathbf{F}_{CD} + \mathbf{F}_{AB} = 0$ and these
two forces taken together have no effect on the motion of the loop. The remaining
forces $\mathbf{F}_{BC}$, $\mathbf{F}_{DA}$ have equal magnitude $iaB$ and opposite directions, but they have
different lines of action. As a result the total force $\mathbf{F}_{BC} + \mathbf{F}_{DA} = 0$ but they produce
a net torque which tends to rotate the loop about the axis $xx'$ as shown in Fig. 14.1a.
This torque can be represented by a vector pointing along the $xx'$ axis from right
to left in Fig. 14.1a. The magnitude of this torque, $\tau'$ say, equals twice the torque
caused by $\mathbf{F}_{BC}$, that is:


$$ \tau' = 2(iaB)\left(\frac{b}{2}\right)\sin\theta = iabB\sin\theta = iSB\sin\theta \qquad (14.54) $$


where $S = ab$ is the area of the loop. It can be shown that this result holds for
all plane loops of area S, whether they are rectangular or not. If we have N loops
together, as in a coil, then the total torque on the coil is $\tau_{coil} = N\tau' = iNSB\sin\theta$.
The quantity:


$$ \boldsymbol{\mu} = iNS\,\mathbf{e} \qquad (14.55) $$


where $\mathbf{e}$ is the unit vector normal to the plane of the loop (oriented by the right-hand
rule with respect to the current), we call the magnetic dipole
moment of the coil and the coil itself we call a magnetic dipole. In general by a
magnetic dipole we understand any structure which interacts with the magnetic
field and this interaction is characterized by the magnetic dipole moment of the
magnetic dipole and the magnetic field B, producing the torque:


$$ \boldsymbol{\tau} = \boldsymbol{\mu} \times \mathbf{B}. \qquad (14.56) $$




When a magnetic dipole is placed in a magnetic field its orientation changes, so
that work (positive or negative) must be done by an external agent to restore the
orientation of the magnetic dipole. Thus the magnetic dipole has potential energy
U associated with its orientation in an external magnetic field. This energy may be
taken to vanish for any arbitrary initial position of the magnetic dipole. If we assume
that the potential energy vanishes when $\boldsymbol{\mu}$ and $\mathbf{B}$ are at right angles (that is, when
$\theta = \pi/2$ in (14.54)), then it can be shown that:


$$ U = -\boldsymbol{\mu}\cdot\mathbf{B}. \qquad (14.57) $$


The conclusion from the above considerations is that a magnetic dipole acquires 
a torque in its center of mass which is of no kinematical character but it is due to its 
interaction with the magnetic field. This ‘dynamic’ torque gives rise to an angular 
momentum, which we call the intrinsic angular momentum or spin angular 
momentum of the magnetic dipole. This angular momentum must be added to the 
kinematic angular momentum of the magnetic dipole (e.g. the particle) to make up 
the total angular momentum which modulates the motion of the magnetic dipole in 
a magnetic field. 
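
The torque on a current loop can be reproduced numerically by summing the forces $i\,d\mathbf{l}\times\mathbf{B}$ on the four straight sides and comparing the resulting torque with $\boldsymbol{\mu}\times\mathbf{B}$. The sketch below is an illustration in plain Python. The helper name, the numerical values and the particular orientation of the loop are assumptions made for the example; the magnetic moment is taken normal to the loop plane, with magnitude equal to current times area.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def loop_force_and_torque(corners, current, B):
    """Sum i*(dl x B) over the straight sides of a closed loop.

    Returns the total force and the torque about the centroid of the corners.
    """
    n = len(corners)
    center = [sum(c[k] for c in corners) / n for k in range(3)]
    force = [0.0, 0.0, 0.0]
    torque = [0.0, 0.0, 0.0]
    for s in range(n):
        start, end = corners[s], corners[(s + 1) % n]
        side = [end[k] - start[k] for k in range(3)]
        f = [current * x for x in cross(side, B)]
        arm = [(start[k] + end[k]) / 2.0 - center[k] for k in range(3)]
        t = cross(arm, f)
        force = [force[k] + f[k] for k in range(3)]
        torque = [torque[k] + t[k] for k in range(3)]
    return force, torque


current, a, b, B0 = 2.0, 0.20, 0.10, 0.5   # A, m, m, T (hypothetical values)
theta = math.radians(30.0)                 # angle between the loop normal and B
B = [0.0, 0.0, B0]

# Unit vectors spanning the loop plane; the normal u1 x u2 makes angle theta with B.
u1 = [0.0, 1.0, 0.0]
u2 = [-math.cos(theta), 0.0, math.sin(theta)]
normal = cross(u1, u2)

corners = [[sa * a / 2 * u1[k] + sb * b / 2 * u2[k] for k in range(3)]
           for sa, sb in [(-1, -1), (1, -1), (1, 1), (-1, 1)]]

force, torque = loop_force_and_torque(corners, current, B)
mu = [current * a * b * n_k for n_k in normal]
torque_expected = cross(mu, B)
torque_magnitude = math.sqrt(sum(t * t for t in torque))

print(force, torque, torque_expected)
```

For a uniform field the force on each straight side acts effectively at its midpoint, which is why summing over four midpoints gives the exact torque and not an approximation.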


Example 14.5.1 Consider the Bohr model of the atom of hydrogen in which the
electron moves on a circular path of radius r around the nucleus. This may be
considered as a tiny current loop so that the atom itself is a magnetic dipole. The
magnetic dipole moment of this atom we call the orbital magnetic dipole moment and
denote by $\mu_l$. Derive a relation between $\mu_l$ and the orbital angular momentum $L_l$
of the electron. Compute $\mu_l$ if $r = 5.1 \times 10^{-11}$ m and the ratio $e/m = 1.76 \times 10^{11}$ C/kg.

Solution

The force on the electron due to the charge of the nucleus is the Coulomb
force $F = k\frac{e^2}{r^2}$ where e is the charge of the electron (and the nucleus) and k is a
constant depending on the system of units. This force is a centripetal force, therefore
$F = \frac{mv^2}{r}$ where m is the mass of the electron and v its speed. Equating the two
expressions of the force we find:


$$ v = \sqrt{\frac{ke^2}{mr}}. $$


The angular velocity of rotation is:


$$ \omega = \frac{v}{r} = \sqrt{\frac{ke^2}{mr^3}}. $$


The current produced by the rotation of the electron of charge e is the rate at which
it passes through any given point of the orbit, hence:


$$ i = e\nu = e\,\frac{\omega}{2\pi} = \frac{e^2}{2\pi}\sqrt{\frac{k}{mr^3}}. $$


The orbital dipole moment $\mu_l$ is given by (14.55) if we put $N = 1$ and $S = \pi r^2$, that
is:


$$ \mu_l = i\pi r^2 = \frac{e^2}{2\pi}\sqrt{\frac{k}{mr^3}}\,\pi r^2 = \frac{e^2}{2}\sqrt{\frac{kr}{m}}. \qquad (14.58) $$


The orbital angular momentum $L_l$ of the electron is:


$$ L_l = mvr = m\omega r^2 = m\sqrt{\frac{ke^2}{mr^3}}\,r^2 = \frac{2m}{e}\mu_l $$


which shows that the orbital angular momentum of the electron is proportional to
the magnetic dipole moment.
Introducing the data in (14.58) and taking $k = \frac{1}{4\pi\epsilon_0}$ (MKS system) we compute:


$$ \mu_l = 9.1 \times 10^{-24}\ \mathrm{A\,m^2}. $$
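
The arithmetic of the example can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the values of the elementary charge and of the vacuum permittivity are standard rounded constants that are not quoted in the example, and the mass is obtained from the given ratio e/m.

```python
import math

E_CHARGE = 1.602e-19      # elementary charge in C (standard value, assumed)
EPSILON_0 = 8.854e-12     # vacuum permittivity in F/m (standard value, assumed)

r = 5.1e-11               # orbit radius in m, from the example
e_over_m = 1.76e11        # charge to mass ratio in C/kg, from the example

m = E_CHARGE / e_over_m                   # electron mass implied by e/m
k = 1.0 / (4.0 * math.pi * EPSILON_0)     # Coulomb constant in the MKS system

omega = math.sqrt(k * E_CHARGE**2 / (m * r**3))   # angular velocity of the orbit
current = E_CHARGE * omega / (2.0 * math.pi)      # charge passing a point per second
mu_l = current * math.pi * r**2                   # dipole moment, N = 1, S = pi r^2
L_l = m * omega * r**2                            # orbital angular momentum

mu_closed_form = 0.5 * E_CHARGE**2 * math.sqrt(k * r / m)   # right side of (14.58)

print(f"mu_l = {mu_l:.3e} A m^2")
print(f"L_l  = {L_l:.3e} J s")
```

Computing the moment step by step (angular velocity, current, area) and then comparing with the closed form is a cheap way to catch algebra slips in (14.58).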


It is an experimental fact that the elementary particles are magnetic dipoles, that
is, they have an intrinsic angular momentum. Originally this was confirmed for
the electron and then it was established for all other particles. This means that an
electron in its proper frame creates an electric field due to its charge and a magnetic
field due to its magnetic dipole moment. The field lines of these two fields are shown
in Fig. 14.2, where the intrinsic angular momentum $L_s$ is also shown. The fact that
the elementary particles are magnetic dipoles, and not simply charged or neutral
units of mass, indicates that they consist of 'smaller', more 'elementary' particles,
in the same way that the atom is a magnetic dipole due to the motion of the orbiting
electron. This is true even for the particles with zero intrinsic angular momentum,
in the sense that the parts of which they consist cancel each other's effects overall.
The following exercise could be an extreme mechanistic physical explanation of the
magnetic dipole moment of the electron.


Fig. 14.2 Electric and magnetic field lines of electron (panels (a) and (b))


Exercise 14.5.1 Assume that the electron is a small sphere of radius R, its charge
and mass being distributed uniformly throughout its volume. It has been measured
that such an electron has an intrinsic angular momentum $L_s = 0.53 \times 10^{-34}$ J s
and a magnetic dipole moment $\mu_s = 9.1 \times 10^{-24}$ A m². Show that the ratio $e/m =
2\mu_s/L_s$. To justify this result divide the spherical electron into infinitesimal current
loops and find an expression for the magnetic dipole moment by integration.


14.5.2 The Relativistic Spin 


In the last section we have shown that besides the kinematic angular momentum 
bivector and the torque tensor which correspond directly to the relevant concepts of 
Newtonian theory, and both vanish in the particle’s proper frame, there is another 
angular momentum of non-kinematic nature which must be taken into account. 
Although the appropriate place to discuss this topic is Quantum Electrodynamics,⁸
in the following we shall attempt a classical treatment which we think has physical
value.
The question we have to answer is:


How could one incorporate the two types of angular momentum, kinematic and dynamic,
in one, the total angular momentum?


The answer to this question is necessary because experiment has shown that the 
elementary particles are magnetic dipoles, therefore their motion in magnetic fields 
(which is a routine in experimental Physics) will be modulated by the total angular 
momentum and not by the orbital angular momentum alone. 

Looking for the answer we note that the orbital angular momentum bivector $L_{ab}$
has only electric part. Therefore if we add to $L_{ab}$ a bivector $S_{ab}$ which has only
magnetic part, then we have the total angular momentum while we keep the


⁸ The reader may wonder why we bother to discuss the concept of spin within the limits of the
non-quantum theory when we know that only a quantal description can be correct. The answer lies 
in the quantal theorem which states that the classical equation of motion of a dynamical variable 
is the quantal equation of motion of the mean value of that variable averaged over an ensemble of 
identical systems. Therefore the conclusions we shall draw with the classical treatment will apply 
to averages over many identical particles prepared in the same way, like the electrons or muons in 
a beam or the valence electrons in a gas of atoms in a glow tube. 




kinematic and the dynamic characters apart. The requirement that the electric part
$E^a(S)$ of $S_{ab}$ vanishes is (use (14.4) to see this):


$$ S_{ab}\,p^b = 0 \Leftrightarrow S_a u^a = 0 \qquad (14.59) $$

where we have introduced the spin vector:

$$ S^a = \frac{1}{2c^2}\eta^{abcd} u_b S_{cd} \qquad (14.60) $$

or equivalently:

$$ S_{ab} = \eta_{abcd} S^c u^d. \qquad (14.61) $$


Concerning the magnetic part $H^a(S)$ of $S_{ab}$ from (14.8) we find:


$$ H^a(S) = \frac{1}{2p^2}\eta^{abcd} p_b S_{cd} = \frac{1}{m} S^a. \qquad (14.62) $$

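The length of the spin vector $S^a$ is computed as follows:


$$ S^2 = S^a S_a = \frac{1}{4c^4}\eta^{abcd} u_b S_{cd}\,\eta_{arst} u^r S^{st} $$

$$ = -\frac{1}{4c^4}\left(\delta^b_r\delta^c_s\delta^d_t - \delta^b_r\delta^c_t\delta^d_s + \delta^b_s\delta^c_t\delta^d_r - \delta^b_s\delta^c_r\delta^d_t + \delta^b_t\delta^c_r\delta^d_s - \delta^b_t\delta^c_s\delta^d_r\right) u_b u^r S_{cd} S^{st} $$

$$ = -\frac{1}{4c^4}(-2c^2)\, S_{ab}S^{ab} = \frac{1}{2c^2} S_{ab}S^{ab}. \qquad (14.63) $$


The invariant $S^2$ we call the spin of the particle. It is independent of the mass and
it is this quantity which is quantized in multiples of $\hbar/2$.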

There still remains the propagation of the intrinsic angular momentum along
the particle's worldline, that is the quantity $\frac{dS_{ab}}{d\tau} = \dot{S}_{ab}$. This is a bivector
whose irreducible parts are the vectors $e^a(S)$ and $h^a(S)$ computed in the general
relations (14.15) and (14.16). Substituting in these relations $E^a(S) = 0$, $H^a(S) =
\frac{1}{m} S^a$ we find:


$$ e^a(S) = -\frac{1}{p^2}\eta^{abcd} p_b \dot{p}_c H_d \qquad (14.64) $$

$$ h^a(S) = h^a{}_b \dot{H}^b = \frac{1}{m} h^a{}_b \dot{S}^b. $$


From (14.7) we have then:


$$ \dot{S}_{ab} = -e_a(S)\,p_b + e_b(S)\,p_a - \eta_{abcd}\,p^c h^d(S) = -e_a(S)\,p_b + e_b(S)\,p_a - \eta_{abcd}\,u^c \dot{S}^d. \qquad (14.65) $$


This is as far as one can go with the mathematics. Physics will give an expression
for the quantity $\dot{S}_{ab}$ and there will result an equation of motion for the spin vector.
Newton's Second Law cannot be used because $S_{ab}$ is a non-Newtonian physical
quantity, therefore Newtonian Physics cannot (and in fact need not) say anything
about it.

To make a statement, Physics turns to experiment and observation.
Because $S_{ab}$ is an angular momentum, $\dot{S}_{ab}$ must be a torque. We have seen that
with each magnetic dipole there is associated a magnetic dipole moment $\boldsymbol{\mu}$, and if
a magnetic dipole is placed in a magnetic field B it suffers a torque $\boldsymbol{\tau} = \boldsymbol{\mu} \times \mathbf{B}$
(see (14.56)). Furthermore, experiment has shown that the elementary particles (in
the standard sense of the term) behave as magnetic dipoles. These and the expected
application of the theory of relativity to elementary particles lead us to relate the
magnetic dipole moment with the magnetic part of the tensor $\dot{S}_{ab}$, and subsequently
via (14.65), with the spin vector $S^a$.

Now, let $\tau_{ab}(S)$ be the torque tensor corresponding to the intrinsic angular
momentum $S_{ab}$. Then we assume the equation of motion:


$$ \dot{S}_{ab} = \tau_{ab}(S). \qquad (14.66) $$


The torque tensor has an 'electric' part (which vanishes) and a 'magnetic' part
$H^a(\tau)$ which equals the magnetic part of $\dot{S}_{ab}$, that is we have:


$$ H^a(\tau) = h^a(S) = \frac{1}{m} h^a{}_b \dot{S}^b. $$

In analogy with the Newtonian result (14.56), we define $H^a(\tau)$ by the formula:


$$ H^a(\tau) = \frac{1}{m}\eta^{abcd} u_b \mu_c B_d \qquad (14.67) $$


where $\mu_c$ is the magnetic dipole moment of the particle. Then the equation of
motion of the spin vector is:


$$ h^a{}_b \dot{S}^b = \eta^{abcd} u_b \mu_c B_d \qquad (14.68) $$

or, using (14.78) (see below):

$$ h^a{}_b \dot{S}^b = g\frac{|q|}{2mc}\eta^{abcd} u_b S_c B_d. \qquad (14.69) $$



In the proper frame of the particle the equation of motion (14.69) is written:


$$ \frac{dS^{*\mu}}{d\tau} \overset{*}{=} g\frac{|q|}{2mc}\eta^{\mu\nu\rho} S^*_\nu B^*_\rho \qquad (14.70) $$


where the * beside a symbol indicates that the quantity is computed in the proper
frame of the particle and the * above the equality sign indicates that the equation
holds in the proper frame of the particle only.

The equation of motion (14.69) specifies the spatial part $h^a{}_b \dot{S}^b$ of $\dot{S}^a$. To find the
equation of motion of $S^a$ we write⁹:


$$ \dot{S}^a = -\frac{1}{c^2}(\dot{S}^b u_b)\,u^a + h^a{}_b \dot{S}^b \qquad (14.71) $$

under the condition $S^a u_a = 0$. This equation can be written:

$$ \dot{S}^a = \frac{1}{c^2}(S^b \dot{u}_b)\,u^a + h^a{}_b \dot{S}^b \qquad (14.72) $$


where $\dot{u}^a$ is the four-acceleration of the particle. Using Newton's generalized second
law we write $\dot{u}^a = F^a/m$ where $F^a$ is the (inertial) four-force acting on the particle
and m is its mass. These give:


$$ \dot{S}^a = \frac{1}{mc^2}(S^b F_b)\,u^a + g\frac{|q|}{2mc}\eta^{abcd} u_b S_c B_d $$

which is the general formula of the propagation of spin.
We derive a formula for $\dot{S}^a$ when the particle moves in a homogeneous
electromagnetic field¹⁰ $F_{ab}$. In this case:


$$ F_a = q F_{ab} u^b \qquad (14.73) $$

hence:

$$ \dot{S}^a = \frac{q}{mc^2}(S^b F_{bc} u^c)\,u^a + g\frac{|q|}{2mc}\eta^{abcd} u_b S_c B_d. \qquad (14.74) $$


We know that the magnetic field:


$$ B^a = \frac{1}{2c}\eta^{abcd} u_b F_{cd}. \qquad (14.75) $$

⁹ This is the 1+3 decomposition of $\dot{S}^a$ wrt $u^a$. Note that $S^a u_a = 0$ does not imply $\dot{S}^a u_a = 0$.
¹⁰ See (a) 'Precession of the polarization of particles moving in a homogeneous electromagnetic
field' by V. Bargmann, L. Michel and V. L. Telegdi, Phys. Rev. Letters 2 (1959), 435-436. (b) 'Spin
and orbital motions of a particle in a homogeneous magnetic field' by V. Henry and J. Silver, Phys.
Rev. 180 (1969), 1262-1263.




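Replacing in the second term of (14.74) we find:


$$ g\frac{|q|}{2mc}\eta^{abcd} u_b S_c B_d = g\frac{|q|}{4mc^2}\eta^{abcd} u_b S_c\,\eta_{drst} u^r F^{st} $$

$$ = g\frac{|q|}{4mc^2}\,2\left(\delta^a_r\delta^b_s\delta^c_t + \delta^a_s\delta^b_t\delta^c_r + \delta^a_t\delta^b_r\delta^c_s\right) u_b S_c u^r F^{st} $$

$$ = g\frac{|q|}{2mc^2}\left(u^a F^{bc} + u^c F^{ab} + u^b F^{ca}\right) u_b S_c $$

$$ = g\frac{|q|}{2m} F^{ac} S_c + g\frac{|q|}{2mc^2}(F^{bc} u_b S_c)\,u^a. $$

Therefore:

$$ \dot{S}^a = \frac{q}{mc^2}(S^b F_{bc} u^c)\,u^a + g\frac{|q|}{2m} F^{ac} S_c + g\frac{|q|}{2mc^2}(F^{bc} u_b S_c)\,u^a \Rightarrow \qquad (14.76) $$

$$ \dot{S}^a = \frac{|q|}{m}\frac{g}{2} F^{ac} S_c - \frac{1}{mc^2}\left(\frac{g}{2}|q| - q\right)(S^b F_{bc} u^c)\,u^a. \qquad (14.77) $$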


The experiments which confirm the proportionality of the magnetic dipole
moment and the change of the intrinsic angular momentum are called
'gyromagnetic' experiments. Relation (14.77) has been confirmed by such experiments on
many different systems. The constant of proportionality is one of the parameters
characterizing the particular system. It is normally specified by giving the
gyromagnetic ratio or g-factor, defined by the relation:


$$ \mu^a = g\frac{|q|}{2mc} S^a \qquad (14.78) $$


where $|q|/m$ is the (measure of) charge to mass ratio of the particle. The first
successful experiments to show this were done by Einstein and de Haas as early
as 1915 and later (1935) by Barnett.¹¹ It has been found that for the electron


¹¹ Einstein A. and de Haas W. J. (1915) Verhandl. Deut. Phys. Ges. 17, 152; Barnett S. J. (1935), Rev.
Mod. Phys. 7, 129.




$g_{e^-} = -2$ and for the positron $g_{e^+} = 2$. The pion has $g_\pi = 0$. The magnetic dipole
moment of the hydrogen atom results from both the electron orbital motion and the
electron spin. These two interact and the value of the g-factor for the atom is between
$-1$ (pure electron orbit) and $-2$ (pure electron spin). For electrons, positrons and
muons experiment has given the following values:


$$ g_{e^-} = -2(1 + 1.1596 \times 10^{-3}) $$
$$ g_{e^+} = +2(1 + 1.17 \times 10^{-3}) $$
$$ g_{\mu^-} = -2(1 + 1.166 \times 10^{-3}) $$
$$ g_{\mu^+} = +2(1 + 1.16 \times 10^{-3}) $$


therefore for all these cases we find the simple equation of motion for the spin
vector:


$$ \dot{S}^a \approx \frac{|q|}{m}\frac{g}{2} F^{ac} S_c. $$

The difference:

$$ a = \frac{g}{2}\frac{q}{|q|} - 1 $$


is called the magnetic moment anomaly. Note that the value of a is the same
for electron and positron and this result holds in general for a particle and its
antiparticle. This is a result of the ratio $q/|q|$ and the opposite signs of g for each
particle.

From the general relation (14.77) we note that $\dot{S}^a S_a = 0 \Rightarrow S^2 =$ constant. This
result and $S^a u_a = 0$ show that the spin vector is a spacelike vector in the rest space
of $u^a$ which rotates about the origin of the proper frame of the particle. The rotation
depends on the external magnetic field.


14.5.3 Motion of a Particle with Spin in a Homogeneous 
Electromagnetic Field 


Consider a particle of mass m, charge q and spin vector $S^a$ moving in a
homogeneous electromagnetic field $F_{ab}$. If $u^a$ is the four-velocity of the particle the
four-force on the particle is $qF_{ab}u^b$ and the magnetic field is $B_a = \frac{1}{2c}\eta_{abcd} u^b F^{cd}$.
Newton's second Law is the equation of motion of the four-velocity:


$$ m\dot{u}_a = q F_{ab} u^b \qquad (14.79) $$




and (14.77) is the equation of motion of the spin vector. Because the electromagnetic
field is homogeneous, the four-acceleration is constant and equal to $\dot{u}^a = \frac{q}{m} F^a{}_b u^b$.
To find the motion of the spin vector we consider (14.77) in the proper frame of the
particle where this equation is reduced to (14.70), which we write in the form:


$$ \dot{S}^{*\mu} = a\,\eta^{\mu\nu\rho} S^*_\nu B^*_\rho \qquad (14.80) $$


where $a = g\frac{|q|}{2mc}$ and an * indicates a 3-vector in the proper frame of the particle.
From the equation of motion (14.80) we have:


$$ S^{*2} = \text{constant} \quad \text{and} \quad \dot{S}^{*\mu} B^*_\mu = 0 \Rightarrow \left(S^{*\mu} B^*_\mu\right)^{\cdot} = 0 $$

therefore the angle $\phi$ between the 3-vectors $S^{*\mu}$, $B^{*\mu}$ is constant in the proper frame
of the particle. This implies that during the motion of the spin (not of the particle!)
the vector $S^{*\mu}$ traces the surface of a right circular cone with axis along the magnetic
field $B^{*\mu}$ with opening angle $\phi$. The solution of (14.80) is written as follows:


$$ S^{*\mu}(\tau) = S^* \sin\phi\left(\cos\omega^*\tau\; e^\mu_{(1)} + \sin\omega^*\tau\; e^\mu_{(2)}\right) + \cos\phi\, S^* e^\mu_{(3)} \qquad (14.81) $$


where $e_{(1)}, e_{(2)}, e_{(3)}$ is an orthonormal basis and $B^{*\mu} = B^* e^\mu_{(3)}$. To determine
the angular speed $\omega^*$ we compute the derivative $\dot{S}^{*\mu}$ and then use the equation of
motion (14.80). We find:


$$ \omega^*\left(-\sin\omega^*\tau\; e^\mu_{(1)} + \cos\omega^*\tau\; e^\mu_{(2)}\right) = a\,\eta^{\mu\nu\rho}\left(\cos\omega^*\tau\; e_{(1)\nu} + \sin\omega^*\tau\; e_{(2)\nu}\right) B^* e_{(3)\rho}. $$

Taking $\eta^{123} = 1$ we find easily that the magnitude of the angular speed is $\omega^* = aB^*$, hence:

$$ \boldsymbol{\omega}^* = -aB^*\,\mathbf{e}_{(3)}. \qquad (14.82) $$

Therefore the solution of the equation of motion in the proper frame is:

$$ S^{*\mu} = S^* \sin\phi\left(e^\mu_{(1)}\cos aB^*\tau + e^\mu_{(2)}\sin aB^*\tau\right) + \cos\phi\, S^* e^\mu_{(3)}. \qquad (14.83) $$


This represents a regular precession in which the spin vector traces out a right
circular cone with the direction of the magnetic field as axis and constant angular
velocity $\omega^* = \frac{g}{2}\frac{|q|B^*}{mc}$. The quantity $\omega_c^* = \frac{|q|B^*}{mc}$ is the cyclotronic frequency¹² in
the proper frame of the particle (see Fig. 14.3).
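
The precession can also be seen by integrating the proper-frame equation of motion numerically. The sketch below is an illustration under stated assumptions: it takes the 3-vector form $d\mathbf{S}/d\tau = a\,\mathbf{S}\times\mathbf{B}$ (that is, the convention $\eta^{123} = +1$; the opposite convention reverses the sense of rotation), uses a hand-written fourth-order Runge-Kutta step, and works with arbitrary dimensionless values of a, B and the opening angle. It checks that the length of the spin and its component along B stay constant and that the vector returns to its starting point after one period $2\pi/(aB)$.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def spin_rate(S, a, B):
    """Right-hand side of dS/dtau = a S x B (assumed 3-vector form of (14.80))."""
    return [a * x for x in cross(S, B)]


def rk4_step(S, dt, a, B):
    """One classical fourth-order Runge-Kutta step."""
    k1 = spin_rate(S, a, B)
    k2 = spin_rate([S[i] + 0.5 * dt * k1[i] for i in range(3)], a, B)
    k3 = spin_rate([S[i] + 0.5 * dt * k2[i] for i in range(3)], a, B)
    k4 = spin_rate([S[i] + dt * k3[i] for i in range(3)], a, B)
    return [S[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0
            for i in range(3)]


def integrate(S0, a, B, t_end, steps):
    """Integrate the spin equation from 0 to t_end with a fixed step."""
    S = list(S0)
    dt = t_end / steps
    for _ in range(steps):
        S = rk4_step(S, dt, a, B)
    return S


a_coeff = 1.5                   # plays the role of a = g|q|/(2mc), arbitrary units
B_star = 2.0                    # field strength in the proper frame, arbitrary units
B = [0.0, 0.0, B_star]
phi = 0.7                       # opening angle of the cone in radians
S0 = [math.sin(phi), 0.0, math.cos(phi)]

omega = a_coeff * B_star        # magnitude of the precession angular speed
period = 2.0 * math.pi / omega

S_quarter = integrate(S0, a_coeff, B, period / 4.0, 2000)
S_full = integrate(S0, a_coeff, B, period, 8000)

print(S_quarter, S_full)
```

With this sign convention the tip of the spin vector moves from the $e_{(1)}$ direction towards $-e_{(2)}$, which corresponds to an angular velocity vector antiparallel to B.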


¹² This is the angular speed of a particle of charge q which is introduced at right angles to a uniform
magnetic field of magnetic induction $B^*$.




Fig. 14.3 Spin precession in a uniform magnetic field


14.5.4 Transformation of Motion in Σ


In order to find the motion of spin in another coordinate frame, Σ say, we have
to apply the appropriate Lorentz transformation to the various quantities involved.
However this is not enough. Indeed in the proper frame the spin precesses around the
magnetic field with angular velocity $\boldsymbol{\omega}^* = -\frac{g}{2}\boldsymbol{\omega}_c^*$ whereas the particle accelerates
(i.e. $a^+ \neq 0$) as it moves. This means that the spatial directions in the proper frame
of the particle suffer the Thomas rotation which (in Σ!) is given by the angular
velocity $\boldsymbol{\omega}_T = -\frac{\gamma_u - 1}{u^2}\,\mathbf{u}\times\mathbf{a}$, where $\mathbf{u}$, $\mathbf{a}$ are the velocity and the acceleration of the
particle in Σ (see (14.69)). Therefore in Σ the spin vector executes two independent
rotational motions with angular velocities $\boldsymbol{\omega}^*$ and $\boldsymbol{\omega}_T$, the net angular velocity being
the composition of the two angular velocities. Let us compute the motion of the spin
in Σ.

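Choose the coordinates in Σ so that the z-axis is along the direction of the
homogeneous magnetic field B and assume that the initial velocity of the particle
is normal to the magnetic field so that the motion takes place in the plane x, y
with basis vectors $\mathbf{e}_1, \mathbf{e}_2$. The electromagnetic field induced in the proper frame
(= local rest frame!) of the particle at each position along its trajectory is given by
the transformation formulae:


$$ \mathbf{E}^* = \mathbf{E}_\parallel + \gamma_u\left(\mathbf{E}_\perp + \mathbf{u}\times\mathbf{B}\right) = \gamma_u uB\,\mathbf{e}_r \qquad (14.84) $$

$$ \mathbf{B}^* = \mathbf{B}_\parallel + \gamma_u\left[\mathbf{B}_\perp - \frac{1}{c^2}\,\mathbf{u}\times\mathbf{E}\right] = \gamma_u B\,\mathbf{e}_3 \qquad (14.85) $$


where $\mathbf{u} = u\,\mathbf{e}_\phi$ is the velocity of the particle in Σ ($\mathbf{e}_\phi$ being the unit vector tangent to the orbit). We see that in the proper frame
the direction of the magnetic field is along the z-axis whereas the electric field is
uniform and along the direction of the radius. The force due to the electric field is:


$$ \mathbf{F}^* = q\mathbf{E}^* = q\gamma_u uB\,\mathbf{e}_r \qquad (14.86) $$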




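and it is a centripetal force (otherwise the particle would not rotate¹³) with
acceleration:


$$ \mathbf{a}^* = \boldsymbol{\omega}_c^* \times \mathbf{u}. \qquad (14.87) $$

From the above we find:

$$ \omega_c^* = \gamma_u\frac{|q|B}{mc} = \gamma_u\,\omega_0 \qquad (14.88) $$


where $\omega_0 = \frac{|q|B}{mc}$ is the Newtonian cyclotronic angular speed in Σ. Therefore in the
proper frame we have that the spin precesses with angular velocity:


$$ \boldsymbol{\omega}^* = -\frac{g}{2}\gamma_u\,\omega_0\,\mathbf{e}_3. \qquad (14.89) $$

From the transformation of the angular velocity¹⁴ we have that in Σ:

$$ \boldsymbol{\omega} = -\frac{g}{2}\omega_0\,\mathbf{e}_3. \qquad (14.90) $$


Concerning the Thomas precession we have shown in (6.71) that for a circular
motion:


$$ \boldsymbol{\omega}_T = -(\gamma_u - 1)\,\omega_c\,\mathbf{e}_3 \qquad (14.91) $$

where $\omega_c$ is the angular (i.e. the cyclotronic) speed of rotation given by:


$$ \omega_c = -\frac{|q|B}{\gamma_u mc} = -\frac{1}{\gamma_u}\omega_0. \qquad (14.92) $$


Eventually we have that in Σ the spin precesses around the direction of B with total
angular velocity:


$$ \boldsymbol{\omega}_s = \boldsymbol{\omega} + \boldsymbol{\omega}_T = -\left[\frac{g}{2} + \frac{1}{\gamma_u} - 1\right]\omega_0\,\mathbf{e}_3. \qquad (14.93) $$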


¹³ We can always make that force centripetal by changing the direction of the speed u.
¹⁴ Note that the angular velocity equals $2\pi/T$ therefore its transformation is like $T^{-1}$, that is:


$$ \omega = \frac{\omega^*}{\gamma_u} $$


where $\omega$ is the angular speed in Σ and $\omega^*$ is the angular speed in Σ*.




One important (and well known) conclusion concerns the difference:


$$ \boldsymbol{\omega}_s - \omega_c\,\mathbf{e}_3 = -\left[\frac{g}{2} - 1\right]\omega_0\,\mathbf{e}_3. \qquad (14.94) $$


This is independent of the speed of the particle and, furthermore, it can be
measured directly. This fact facilitates the measurement of the factor g because one
can use beams of the same particles with different initial speeds.
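
The speed independence of the difference between the spin precession and the cyclotron frequencies is easy to confirm numerically. The sketch below is an illustration under stated assumptions: it works with the signed components along $\mathbf{e}_3$, takes the cyclotron term as $-\omega_0/\gamma_u$ and the Thomas term as $-(\gamma_u - 1)\omega_c$, sets $\omega_0 = 1$ in arbitrary units, and uses hypothetical function names and a sample value of g.

```python
def cyclotron(omega0, gamma):
    """Signed cyclotron angular speed along e_3: -omega0 / gamma."""
    return -omega0 / gamma


def thomas(omega0, gamma):
    """Signed Thomas term along e_3: -(gamma - 1) * omega_c."""
    return -(gamma - 1.0) * cyclotron(omega0, gamma)


def spin_precession(g, omega0, gamma):
    """Total spin angular velocity along e_3: Larmor-type term plus Thomas term."""
    return -0.5 * g * omega0 + thomas(omega0, gamma)


g = 2.0 * (1.0 + 1.1596e-3)      # sample g-factor (positive sign chosen here)
omega0 = 1.0                     # |q|B/(mc) in arbitrary units
gammas = [1.0, 1.5, 3.0, 29.3]   # a few Lorentz factors of the beam

differences = [spin_precession(g, omega0, gm) - cyclotron(omega0, gm)
               for gm in gammas]
expected = -(0.5 * g - 1.0) * omega0

for gm, diff in zip(gammas, differences):
    print(f"gamma = {gm:5.1f}   omega_s - omega_c = {diff:.6e}")
```

Each individual frequency depends on the Lorentz factor, yet the printed differences agree to rounding error, which is what makes the anomaly directly measurable with beams of different energies.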

In case the velocity of the particles is not normal to the magnetic field, we
decompose it in two components, one parallel to the field and one normal to it.
The above considerations hold for the normal component. The parallel
component stays constant (the magnetic field has no effect parallel to its
direction) hence the motion parallel to the magnetic field is with uniform velocity.
In conclusion the motion of the particle (not the spin!) is the combination of two
motions: one motion with uniform velocity (drifting) parallel to the direction of
B and a planar circular motion with uniform angular velocity $\omega_c$ with axis along
the direction of B. The combination of the two motions results in a helical motion
with axis along the direction of B. This means that the frame Σ we considered is
drifting along the direction of B with constant speed $u_\parallel$, therefore in order to find
the motion in the original frame, Σ′ say, we have to apply the appropriate boost
along the z-axis. This means that the circular frequencies we have found above
must be multiplied with the factor $\frac{1}{\gamma_{u_\parallel}}$. For example in Σ′ the spin precesses with
angular velocity:


$$ \boldsymbol{\omega}'_s = -\frac{1}{\gamma_{u_\parallel}}\left[\frac{g}{2} + \frac{1}{\gamma_u} - 1\right]\omega_0\,\mathbf{e}_3. \qquad (14.95) $$


Chapter 15
The Covariant Lorentz Transformation


15.1 Introduction 


It can hardly be overemphasized that one of the most important elements of Special
Relativity is the Lorentz transformation. This is the reason we have spent so much
space and effort to derive and study the Lorentz transformation in the early chapters
of the book. One could naturally ask “After all these different derivations of the
Lorentz transformation, why are we not yet finished with it?” The reason is the
following. Special Relativity is a geometric theory of Physics, which can be written
and studied covariantly in terms of Lorentz tensors (four-vectors etc.) without the
need to consider a coordinate system until the very end, when one has to compute
explicitly the components of the physical quantities of a problem for some observer.

All derivations of the Lorentz transformation so far used either coordinates
or 3-vectors, that is, they were not covariant.¹ This does not mean that these
transformations are not the general ones or that they are insufficient to deal
with all relativistic problems. The point is that, although the covariant Lorentz
transformation is not necessary for the development and the application of the
theory, in general problems of a qualitative character or in involved problems the
covariant formalism significantly simplifies the calculations, because one can apply
geometric techniques to get answers which would not be feasible with the standard
coordinate or vector form of the transformation. Furthermore, if one describes a
problem in covariant formalism then it is possible to use one of the well-known
computer algebra programs to perform the calculations. This is of practical
importance because it makes easy (or even possible) complex calculations which would
be unbearable to carry out by hand and, most important, without mistakes. Finally,
we emphasize the aesthetic side of the matter and claim that, because the Theory of
Special Relativity is a geometric theory of motion, its description in any but the
covariant formalism hides much of the power and the elegance of the theory.


1 The k-calculus is covariant, but it is very limited in its use for non-introductory problems.


© Springer Nature Switzerland AG 2019
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics,
https://doi.org/10.1007/978-3-030-27347-7_15

In this chapter we derive the covariant form of the Lorentz transformation and we
discuss some of its applications. As might be expected, this chapter is of a more
advanced level; therefore it is advisable to study it after the reader has gained some
experience with Special Relativity.

Before we get into the details, let us take a quick promenade into the relevant
literature. Based on the vector form of the Lorentz transformation Fahnline² gave the
covariant expression of the (proper) Lorentz transformation and the corresponding
covariant transformation of a four-vector. He used this form of the transformation
to compute the composition of successive Lorentz transformations in a brute-force but
direct way. This problem is not easy. Indeed it is well known³ that in order to
compute the covariant form of the Euclidean rotation of this decomposition one
has to invoke the homomorphism between the restricted Lorentz group SO(3,1) and
SL(2,C). Fahnline showed that this is not necessary provided that the covariant form
of the transformation is used.

A different and very interesting approach to the covariant Lorentz transformation
has been given by Krause⁴ who expressed the Lorentz transformation
in terms of two unit timelike vectors $u^a$, $v^a$ corresponding to the four-velocities
of inertial observers related by the transformation. He considered only the proper
Lorentz transformation, whose covariant form he derived using the orthonormal
tetrads associated with the observers $u^a$, $v^a$. He followed a similar technique
developed by Basanski⁵ who used null tetrads to express the Lorentz transformation
in terms of spacetime rotations. The approach of Krause is more fundamental than
that of Fahnline.

Finally, in a more recent work Jantzen et al.⁶ have introduced a covariant form of the
Lorentz transformation based on relative velocity and the fact that under a Lorentz
transformation between two observers the timelike plane defined by the 4-velocities
of the observers, as well as the normal 2-plane to that plane, are preserved. They
obtained the proper Lorentz transformation (they call it “relative observer boost”)
and they discuss some of its properties, however again in a form which obscures the
simple geometric significance of the Lorentz transformation. In a similar approach
Urbantke⁷ studied the Lorentz transformation and considered some applications of
the proper Lorentz transformation in terms of spacetime reflections.


2 Fahnline D (1982) Am. J. Phys. 50, 818-821.

3 See Halpern F. R. (1968) “Special Relativity and Quantum Mechanics” (Prentice Hall, Englewood
Cliffs, NJ) and Goldstein Herbert “Classical Mechanics” Second Edition (1980) Addison-Wesley,
Chap. 7.

4 Krause J (1977) “Lorentz transformations as space-time reflections” J. Math. Physics 18, 879-893.

5 Basanski S L (1965) “Decomposition of the Lorentz Transformation Matrix into Skew-Symmetric
Tensors” J. Math. Physics 6, 1201-1202.

6 Jantzen R, Capini P, Bini D: Ann Phys, 215, (1992) 1 and gr-qc/0106043.

7 Urbantke H K: Found. Phys. Lett., 16, (2003) 111.




Before we present our approach we emphasize once more that the Lorentz 
transformation (in vector or in covariant form) is of a pure mathematical origin 
and has nothing to do with Special Relativity or any other theory of Physics. 
Its connection with Physics is done by the Principle of Special Relativity, which 
demands that all physical quantities in Special Relativity must be covariant under 
the Lorentz transformation (that is, they must be expressed in terms of Lorentz 
tensors). This point is justified by the fact that we are able to derive/study the Lorentz 
transformation based on geometric assumptions only, without making reference to 
any physical concepts or systems. 


15.2. The Covariant Lorentz Transformation 


In Chap. 1 we derived the Lorentz transformation $L : M^4 \to M^4$ as an endomorphism
of Minkowski space satisfying the following two requirements:


1. L is a linear transformation.

2. L is an isometry of $M^4$ which in addition preserves the canonical form
$\eta = \mathrm{diag}(-1, 1, 1, 1)$ of the Lorentz metric, that is, it satisfies the equation
$L^t \eta L = \eta$.


The derivation of the covariant Lorentz transformation will be based on geometric
assumptions which are hidden in the above requirements. Indeed the linearity of
the transformation means that L preserves the linear geometric elements (straight
lines, 2-planes and 3-planes or hyperplanes) in $M^4$. The first part of the second
assumption means that it also preserves the character of these linear elements, that
is, a timelike straight line goes to a timelike straight line, a spacelike 2-plane to
a spacelike 2-plane etc. Because a timelike straight line is characterized by its
unit tangent vector, let $u^a$, $v^a$ be two unit timelike vectors related by the Lorentz
transformation.

Then the preservation of the canonical form $\eta$ of the Lorentz metric implies that
L preserves the Euclidean planes normal to the four-vectors $u^a$, $v^a$. Equivalently
the Lorentz transformation relates the LCF associated with the four-vectors $u^a$, $v^a$.
The above observations are sufficient in order to define the covariant form of the
Lorentz transformation.

In the following we shall need the projection tensors $h_{ab}$ and $p_{ab}$ we defined
in (12.2) and (12.21) respectively.


15.2.1 Definition of the Lorentz Transformation 


Definition 15.2.1 Let $u^a$, $v^a$ be two unit, non-collinear, timelike vectors,
corresponding to the four-velocities of two (inertial) observers, which define a timelike
2-plane. The (planar) Lorentz transformation defined by $u^a$, $v^a$ is the map⁸
$L : M^4 \to M^4$ specified by the following requirements:


1. L is a linear transformation and preserves the timelike 2-plane spanned by the
vectors $u^a$, $v^a$ and the spacelike 2-plane normal to the $u^a$, $v^a$ 2-plane.

2. L is an isometry of $M^4$.

3. L is defined on an equal footing in terms of $u^a$, $v^a$ (this is the so-called
reciprocity principle⁹). This is formulated by the requirement that the inverse
Lorentz transformation is the same as the direct one but with $u^a$, $v^a$ interchanged.


Let us write the verbal requirements of the definition in terms of equations.
Requirement 1 implies for the timelike 2-plane spanned by the vectors $u^a$, $v^a$ the
equations:

\[ L v^a = A u^a + B v^a, \qquad L u^a = C u^a + D v^a \tag{15.1} \]

where A, B, C, D are quantities independent of the spacetime coordinates such that
$AB - CD \neq 0$ (because $u^a$, $v^a$ are not collinear) and $AD \neq 0$ (otherwise the
transformation is singular).

For the spacelike 2-plane normal to that plane the same requirement implies the
equation:

\[ L\, p_{ab}(u, v) = p_{ab}(u, v) \tag{15.2} \]

where $p_{ab}$ is the projection tensor which projects normal to the timelike plane of
$u^a$, $v^a$. Requirement 2 means that the inner products of vectors are preserved. For
the two vectors $u^a$, $v^a$ this requirement implies the equations:

\[ (Lu^a, Lu^a) = (Lv^a, Lv^a) = -1, \qquad (Lu^a, Lv^a) = (u^a, v^a). \tag{15.3} \]

Requirement 3 means that the inverse transformation applied to $u^a$ has the same
result as the direct transformation applied to the vector $v^a$ and vice versa. This
implies the equations:

\[ L^{-1} u^a = A v^a + B u^a, \qquad L^{-1} v^a = C v^a + D u^a. \tag{15.4} \]


We conclude that:


(a) In order to specify the (planar) Lorentz transformation defined by the vectors
$u^a$, $v^a$ it is enough to compute the unknown coefficients A, B, C, D describing
the transformation.

(b) The transformation is defined by equations (15.1), (15.2), (15.3), and (15.4).


8 To be precise this is a transformation in the tangent space of $M^4$ but due to the flatness of
$M^4$ it can be reduced unambiguously to a transformation in $M^4$.

9 The interested reader can find more information on this principle in Bruzzi V and Gorini V (1989)
“Reciprocity Principle and the Lorentz transformation” J. Math. Physics 10, 1518-1524.


15.2.2. Computation of the Covariant Lorentz Transformation 


We introduce the quantity (we consider c = 1 in what follows):

\[ \gamma = -u^a v_a \tag{15.5} \]

(this is the gamma factor relating the two observers) and use equation (15.3) to get:

\[ (Lu^a, Lv^a) = (u^a, v^a) = -\gamma \;\Rightarrow\; (AC + BD) + (BC + AD - 1)\gamma = 0. \]

From the length of the four-vectors we have:

\[ v^a v_a = -1 \;\Rightarrow\; A^2 + 2\gamma AB + B^2 = 1 \]

and

\[ u^a u_a = -1 \;\Rightarrow\; C^2 + 2\gamma CD + D^2 = 1. \]

The first equation of relation (15.4) gives:

\[ u^a = L L^{-1} u^a = L(A v^a + B u^a) = (A^2 + BC)\,u^a + (AB + BD)\,v^a \]

hence:

\[ A^2 + BC = 1, \qquad AB + BD = 0. \]

Similarly the second part of (15.4) gives:

\[ D^2 + BC = 1, \qquad AC + CD = 0. \]

Finally, after some simple manipulations, we have the following system of seven
non-linear equations in the four unknowns A, B, C, D:

\[ B(2\gamma A + B - C) = 0 \tag{15.6} \]
\[ C(2\gamma D + C - B) = 0 \tag{15.7} \]
\[ B(A + D) = 0 \tag{15.8} \]
\[ C(A + D) = 0 \tag{15.9} \]
\[ A^2 - D^2 = 0 \tag{15.10} \]
\[ A^2 + BC = 1 \tag{15.11} \]
\[ (AC + BD) + (BC + AD - 1)\gamma = 0. \tag{15.12} \]


We expect that the system has many solutions, therefore there is more than one
Lorentz transformation satisfying the conditions of Definition 15.2.1. A simple
analysis shows that the solutions of the system can be classified by the vanishing or
not of the quantities A + D, A - D. It is an easy exercise to show that the system
admits the following families of solutions ($AD \neq 0$):


1. $A - D = 0,\; A + D \neq 0 \;\Leftrightarrow\; A = D \neq 0$.

      A      B      C      D
      ±1     0      0      ±1


2. $A + D = 0,\; A - D \neq 0 \;\Leftrightarrow\; A = -D \neq 0$.

      A      B      C      D
      ±1     0      ±2γ    ∓1
      ±1     ∓2γ    0      ∓1


We end up with a total of 2 + 4 = 6 solutions. Replacing the values of the
coefficients A, B, C, D we find the action of the Lorentz transformation (not the
transformation per se!) on the four-vectors $u^a$, $v^a$. The results are collected in
Table 15.1.
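
The claim that these six sets of coefficients solve the system can be checked numerically. The following is a minimal sketch, assuming NumPy and an arbitrary sample value of $\gamma$; the function names `residuals`, `six_solutions` and `check_all` are illustrative and are not part of the text.

```python
import numpy as np


def residuals(A, B, C, D, g):
    """Equations (15.6)-(15.12), each rearranged so that it should equal zero."""
    return np.array([
        B * (2 * g * A + B - C),                    # (15.6)
        C * (2 * g * D + C - B),                    # (15.7)
        B * (A + D),                                # (15.8)
        C * (A + D),                                # (15.9)
        A**2 - D**2,                                # (15.10)
        A**2 + B * C - 1,                           # (15.11)
        (A * C + B * D) + (B * C + A * D - 1) * g,  # (15.12)
    ])


def six_solutions(g):
    """The six coefficient sets (A, B, C, D) listed in Table 15.1."""
    return {
        "I": (1, 0, 0, 1),
        "II": (-1, 0, 0, -1),
        "III": (1, 0, 2 * g, -1),
        "IV": (-1, 0, -2 * g, 1),
        "V": (1, -2 * g, 0, -1),
        "VI": (-1, 2 * g, 0, 1),
    }


def check_all(g):
    """Return, for each case, whether all seven equations are satisfied."""
    return {name: bool(np.allclose(residuals(*abcd, g), 0.0))
            for name, abcd in six_solutions(g).items()}


if __name__ == "__main__":
    # gamma = 1.7 is an arbitrary sample value (any gamma > 1 would do).
    print(check_all(1.7))
```

A set of coefficients outside the two families, for example (A, B, C, D) = (1, 0, 1, 1), leaves a non-zero residual in (15.7), which is a quick way to see that the system is restrictive.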

We observe that cases V, VI are identical with cases IV, III respectively if
we interchange $u^a \leftrightarrow v^a$. This is to be expected because cases V, VI express
the inverse transformation from $u^a$ to $v^a$ and we have assumed that $u^a$, $v^a$ are
completely equivalent. Therefore without loss of generality we may restrict our
considerations to the first four cases I - IV only. By doing so we introduce an
asymmetry, which however can be ignored if we assume that the direct Lorentz
transformation is from $v^a \to u^a$. This implies then that the coefficient B = 0, that
is, we define the generic Lorentz transformation by the relations:

\[ L v^a = \pm u^a, \qquad L u^a = C u^a + D v^a. \tag{15.13} \]


Table 15.1 The possible Lorentz transformations

        A     B      C      D     Lorentz transformation and the inverse
I       1     0      0      1     $Lu^a = v^a$,  $Lv^a = u^a$
                                  $L^{-1}u^a = v^a$,  $L^{-1}v^a = u^a$
II     -1     0      0     -1     $Lu^a = -v^a$,  $Lv^a = -u^a$
                                  $L^{-1}u^a = -v^a$,  $L^{-1}v^a = -u^a$
III     1     0      2γ    -1     $Lu^a = 2\gamma u^a - v^a$,  $Lv^a = u^a$
                                  $L^{-1}u^a = v^a$,  $L^{-1}v^a = 2\gamma v^a - u^a$
IV     -1     0     -2γ     1     $Lu^a = -2\gamma u^a + v^a$,  $Lv^a = -u^a$
                                  $L^{-1}u^a = -v^a$,  $L^{-1}v^a = -2\gamma v^a + u^a$
V       1    -2γ     0     -1     $Lu^a = -v^a$,  $Lv^a = -2\gamma v^a + u^a$
                                  $L^{-1}u^a = -2\gamma u^a + v^a$,  $L^{-1}v^a = -u^a$
VI     -1     2γ     0      1     $Lu^a = v^a$,  $Lv^a = 2\gamma v^a - u^a$
                                  $L^{-1}u^a = 2\gamma u^a - v^a$,  $L^{-1}v^a = u^a$


The coefficient ±1 in the rhs of the first equation (15.13) is due to the fact that
the Lorentz transformation preserves the magnitude and the direction of the
four-vectors but not necessarily the sense of direction.

We compute now the Lorentz transformation itself. From equations (15.13) we
expect that the generic expression of the Lorentz transformation in terms of the
four-vectors $u^a$, $v^a$ will be of the form:

\[ L^a{}_b = \delta^a_b + A_1 u^a v_b + B_1 u^a u_b + C_1 v^a v_b + D_1 v^a u_b \tag{15.14} \]

where $\delta^a_b$ is the identity transformation and the coefficients $A_1, B_1, C_1, D_1$ are
computed in terms of A, B, C, D by means of the action of the transformation on
$u^a$, $v^a$. For example the action of L (on the left) on $u^a$ gives (recall that
$u^a u_a = v^a v_a = -1$ and $u^a v_a = -\gamma$):

\[ L u^a = u^a + A_1 u^a(-\gamma) + B_1 u^a(-1) + C_1 v^a(-\gamma) + D_1 v^a(-1)
         = (1 - A_1\gamma - B_1)\,u^a - (C_1\gamma + D_1)\,v^a. \]

Comparing this with the second of (15.13) we find the equations:

\[ 1 - A_1\gamma - B_1 = C \tag{15.15} \]
\[ C_1\gamma + D_1 = -D. \tag{15.16} \]

Similarly the action of L on $v^a$ gives the equations:

\[ 1 - D_1\gamma - C_1 = B \tag{15.17} \]
\[ A_1 + B_1\gamma = -A. \tag{15.18} \]

The solution of the system of equations (15.15), (15.16), (15.17), and (15.18) is:

\[
\begin{aligned}
A_1 &= \frac{1}{1-\gamma^2}\,(\gamma C - \gamma - A), &
B_1 &= \frac{1}{1-\gamma^2}\,(\gamma A + 1 - C), \\
C_1 &= \frac{1}{1-\gamma^2}\,(\gamma D + 1 - B), &
D_1 &= \frac{1}{1-\gamma^2}\,(\gamma B - \gamma - D).
\end{aligned}
\tag{15.19}
\]


Table 15.2 The distinct Lorentz transformations

        A     B     C      D     $A_1$              $B_1$        $C_1$        $D_1$
I       1     0     0      1     -1/(1-γ)           1/(1-γ)      1/(1-γ)      -1/(1-γ)
II     -1     0     0     -1     1/(1+γ)            1/(1+γ)      1/(1+γ)      1/(1+γ)
III     1     0     2γ    -1     -(2γ+1)/(1+γ)      1/(1+γ)      1/(1+γ)      1/(1+γ)
IV     -1     0    -2γ     1     -(2γ-1)/(1-γ)      1/(1-γ)      1/(1-γ)      -1/(1-γ)
V       Same as IV if u ↔ v
VI      Same as III if u ↔ v


Using the values of the coefficients A, B, C, D for each solution we compute the
value of the coefficients $A_1, B_1, C_1, D_1$ and consequently the Lorentz transformation
for each case. The results of the calculations are collected in Table 15.2.

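We conclude that the covariant expression of the generic Lorentz transformation
(that is, the one which covers all possible cases!) and its inverse in terms of the
initial coefficients A, B, C, D are:

\[
\begin{aligned}
L^a{}_b = \delta^a_b
 &+ \frac{1}{1-\gamma^2}(\gamma C - \gamma - A)\,u^a v_b
  + \frac{1}{1-\gamma^2}(\gamma A + 1 - C)\,u^a u_b \\
 &+ \frac{1}{1-\gamma^2}(\gamma D + 1 - B)\,v^a v_b
  + \frac{1}{1-\gamma^2}(\gamma B - \gamma - D)\,v^a u_b
\end{aligned}
\tag{15.20}
\]

\[
\begin{aligned}
(L^{-1})^a{}_b = \delta^a_b
 &+ \frac{1}{1-\gamma^2}(\gamma B - \gamma - D)\,u^a v_b
  + \frac{1}{1-\gamma^2}(\gamma D + 1 - B)\,u^a u_b \\
 &+ \frac{1}{1-\gamma^2}(\gamma A + 1 - C)\,v^a v_b
  + \frac{1}{1-\gamma^2}(\gamma C - \gamma - A)\,v^a u_b.
\end{aligned}
\tag{15.21}
\]

We note that the transformation is defined solely in terms of the 4-vectors $u^a$, $v^a$
as requested. These expressions are important because:


1. They are fully covariant (independent of any coordinate frame).
2. They depend only on the 4-vectors $u^a$, $v^a$, that is, the four-velocities of the
observers defining the transformation.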


Exercise 15.2.1


(a) Prove that the generic Lorentz transformation does satisfy the isometry condition
$(L^t)_a{}^c\, \eta_{cd}\, L^d{}_b = \eta_{ab}$ where $\eta_{ab}$ is the Lorentz metric.

(b) Prove that the inverse generic Lorentz transformation is found from the direct
by interchanging $u^a \leftrightarrow v^a$.
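
Both parts of the exercise can be sanity-checked numerically before attempting the proof. The sketch below assumes NumPy, two arbitrarily chosen sample 3-velocities (with c = 1), and the matrix convention that row index = upper index a and column index = lower index b; the names `four_velocity`, `generic_L` and `cases` are hypothetical helpers, not notation from the text.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # Lorentz metric, signature (-,+,+,+)


def four_velocity(v3):
    """Unit timelike four-vector (gamma, gamma*v) for a 3-velocity v3 (c = 1)."""
    v3 = np.asarray(v3, dtype=float)
    g = 1.0 / np.sqrt(1.0 - v3 @ v3)
    return np.concatenate(([g], g * v3))


def generic_L(u, v, A, B, C, D):
    """Matrix L^a_b of (15.20); rows carry the index a, columns the index b."""
    g = -(u @ ETA @ v)            # gamma = -u^a v_a, equation (15.5)
    ul, vl = ETA @ u, ETA @ v     # covariant components u_b, v_b
    k = 1.0 / (1.0 - g**2)
    return (np.eye(4)
            + k * (g * C - g - A) * np.outer(u, vl)
            + k * (g * A + 1 - C) * np.outer(u, ul)
            + k * (g * D + 1 - B) * np.outer(v, vl)
            + k * (g * B - g - D) * np.outer(v, ul))


def cases(g):
    """Coefficient sets (A, B, C, D) of Table 15.1."""
    return {"I": (1, 0, 0, 1), "II": (-1, 0, 0, -1),
            "III": (1, 0, 2 * g, -1), "IV": (-1, 0, -2 * g, 1),
            "V": (1, -2 * g, 0, -1), "VI": (-1, 2 * g, 0, 1)}


if __name__ == "__main__":
    u = four_velocity([0.1, -0.2, 0.3])   # sample observers (assumed values)
    v = four_velocity([0.5, 0.2, -0.1])
    g = -(u @ ETA @ v)
    for name, abcd in cases(g).items():
        L = generic_L(u, v, *abcd)
        L_inv = generic_L(v, u, *abcd)    # part (b): interchange u and v
        print(name,
              np.allclose(L.T @ ETA @ L, ETA),    # part (a): isometry
              np.allclose(L_inv @ L, np.eye(4)))
```

A numerical check of this kind does not replace the proof, but it catches sign errors in the coefficients of (15.19) quickly.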


Table 15.3 The covariant form of independent Lorentz transformations in terms of the four
velocities

       Lorentz transformation                                                                  Type

I.     $L^a{}_b = \delta^a_b + \frac{1}{1-\gamma}(u^a - v^a)(u_b - v_b)$                       Space inversion
       $(L^{-1})^a{}_b = \delta^a_b + \frac{1}{1-\gamma}(v^a - u^a)(v_b - u_b)$

II.    $L^a{}_b = \delta^a_b + \frac{1}{1+\gamma}(u^a + v^a)(u_b + v_b)$                       Time inversion
       $(L^{-1})^a{}_b = \delta^a_b + \frac{1}{1+\gamma}(v^a + u^a)(v_b + u_b)$

III.   $L^a{}_b = \delta^a_b + \frac{1}{1+\gamma}(u^a + v^a)(u_b + v_b) - 2u^a v_b$            Proper
       $(L^{-1})^a{}_b = \delta^a_b + \frac{1}{1+\gamma}(v^a + u^a)(v_b + u_b) - 2v^a u_b$

IV.    $L^a{}_b = \delta^a_b + \frac{1}{1-\gamma}(u^a - v^a)(u_b - v_b) + 2u^a v_b$            Spacetime inversion
       $(L^{-1})^a{}_b = \delta^a_b + \frac{1}{1-\gamma}(v^a - u^a)(v_b - u_b) + 2v^a u_b$


Replacing in these expressions the values of the coefficients A, B, C, D of
Table 15.1 for each case (or using Table 15.2) we find the covariant expression for
the corresponding Lorentz transformation. The results are given¹⁰ in Table 15.3.


Exercise 15.2.2


(a) Prove that in cases I, II the Lorentz transformation L satisfies the property
$L = L^{-1}$, i.e. $L^2 = I$. Such operators we call spacetime reflections.
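
The closed forms of Table 15.3 and the involution property of cases I and II can be explored numerically. This is a minimal sketch assuming NumPy and arbitrary sample velocities; `four_velocity` and `table_15_3` are illustrative names, and matrices are stored with the upper index as row and the lower index as column.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # Lorentz metric, signature (-,+,+,+)


def four_velocity(v3):
    """Unit timelike four-vector (gamma, gamma*v) for a 3-velocity v3 (c = 1)."""
    v3 = np.asarray(v3, dtype=float)
    g = 1.0 / np.sqrt(1.0 - v3 @ v3)
    return np.concatenate(([g], g * v3))


def table_15_3(u, v):
    """The four covariant transformations of Table 15.3 as 4x4 matrices L^a_b."""
    g = -(u @ ETA @ v)
    ul, vl = ETA @ u, ETA @ v
    eye = np.eye(4)
    minus = np.outer(u - v, ul - vl) / (1.0 - g)   # (u-v)^a (u-v)_b / (1-gamma)
    plus = np.outer(u + v, ul + vl) / (1.0 + g)    # (u+v)^a (u+v)_b / (1+gamma)
    return {
        "I": eye + minus,                           # space inversion
        "II": eye + plus,                           # time inversion
        "III": eye + plus - 2.0 * np.outer(u, vl),  # proper
        "IV": eye + minus + 2.0 * np.outer(u, vl),  # spacetime inversion
    }


if __name__ == "__main__":
    u = four_velocity([0.1, -0.2, 0.3])   # sample observers (assumed values)
    v = four_velocity([0.5, 0.2, -0.1])
    for name, L in table_15_3(u, v).items():
        print(name, "L^2 = I:", np.allclose(L @ L, np.eye(4)))
```

For the sample vectors only cases I and II square to the identity, in agreement with the statement of the exercise.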


In the last column of Table 15.3 we identify each type of Lorentz transformation
with the corresponding one we computed in Chap. 1. The justification of this
will be given below when we derive the action of each type of Lorentz transformation
in terms of coordinates.


15.2.3 The Action of the Covariant Lorentz Transformation on 
the Coordinates 


The expression of the generic Lorentz transformation we found treats the transformation
on an equal footing wrt the four-vectors $u^a$, $v^a$. However this is not the standard
practice where one considers the Lorentz transformation from one LCF to another.
To reconcile the two approaches we consider the action of the Lorentz transformation
on the components of a four-vector. Indeed using the expression (15.20) it is a
straightforward matter to write the transformation equation and the inverse of any
four-vector $x^a$ in generic and covariant form (recall that B = 0):

\[
\begin{aligned}
x'^a = x^a
 &+ \frac{1}{1-\gamma^2}\left\{\left[(\gamma C - \gamma - A)\,v_b + (\gamma A + 1 - C)\,u_b\right]x^b\right\}u^a \\
 &+ \frac{1}{1-\gamma^2}\left\{\left[(\gamma D + 1)\,v_b - (\gamma + D)\,u_b\right]x^b\right\}v^a,
\end{aligned}
\tag{15.22}
\]

\[
\begin{aligned}
x^a = x'^a
 &+ \frac{1}{1-\gamma^2}\left\{\left[(\gamma C - \gamma - A)\,u_b + (\gamma A + 1 - C)\,v_b\right]x'^b\right\}v^a \\
 &+ \frac{1}{1-\gamma^2}\left\{\left[(\gamma D + 1)\,u_b - (\gamma + D)\,v_b\right]x'^b\right\}u^a.
\end{aligned}
\tag{15.23}
\]

From this we calculate in Table 15.4 the covariant expression of the action of
each type of Lorentz transformation and its inverse on an arbitrary four-vector $x^a$.
Notation: SI = Space Inversion, TI = Time Inversion, PLT = Proper Lorentz
transformation, STI = Spacetime Inversion.


10 The expression of the proper Lorentz transformation given in Table 15.2 coincides (after some
rearrangements) with that of Krause.

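In order to find the standard vector expression of the Lorentz transformation, we
have to consider the proper frame of one of the defining 4-vectors. Take $u^a$ as the
reference vector and denote its proper frame by $\Sigma_u$. Then we have:

\[ u^a = \begin{pmatrix} 1 \\ \mathbf{0} \end{pmatrix}_{\Sigma_u}, \qquad
   v^a = \begin{pmatrix} \gamma \\ \gamma\mathbf{v} \end{pmatrix}_{\Sigma_u}. \tag{15.24} \]


Table 15.4 The covariant form of independent Lorentz transformations in terms of coordinates

       Lorentz transformation                                                                                Type

I.     $L^a{}_b x^b = x^a + \frac{1}{1-\gamma}\left[(u_b - v_b)x^b\right](u^a - v^a)$                        SI
       $(L^{-1})^a{}_b x^b = x^a + \frac{1}{1-\gamma}\left[(v_b - u_b)x^b\right](v^a - u^a)$

II.    $L^a{}_b x^b = x^a + \frac{1}{1+\gamma}\left[(u_b + v_b)x^b\right](u^a + v^a)$                        TI
       $(L^{-1})^a{}_b x^b = x^a + \frac{1}{1+\gamma}\left[(v_b + u_b)x^b\right](v^a + u^a)$

III.   $L^a{}_b x^b = x^a + \frac{1}{1+\gamma}\left[(u_b + v_b)x^b\right](u^a + v^a) - 2\left[v_b x^b\right]u^a$     PLT
       $(L^{-1})^a{}_b x^b = x^a + \frac{1}{1+\gamma}\left[(v_b + u_b)x^b\right](v^a + u^a) - 2\left[u_b x^b\right]v^a$

IV.    $L^a{}_b x^b = x^a + \frac{1}{1-\gamma}\left[(u_b - v_b)x^b\right](u^a - v^a) + 2\left[v_b x^b\right]u^a$     STI
       $(L^{-1})^a{}_b x^b = x^a + \frac{1}{1-\gamma}\left[(v_b - u_b)x^b\right](v^a - u^a) + 2\left[u_b x^b\right]v^a$


Let us consider $x^a$ to be the position four-vector¹¹ and let us assume that
$x^a = \begin{pmatrix} l \\ \mathbf{r} \end{pmatrix}_{\Sigma_u}$. In $\Sigma_u$ we have $u_b x^b = -l$ and
$v_b x^b = \gamma(\mathbf{v}\cdot\mathbf{r} - l)$, so that from (15.22):

\[
\begin{aligned}
\begin{pmatrix} l' \\ \mathbf{r}' \end{pmatrix}_{\Sigma_u}
 = \begin{pmatrix} l \\ \mathbf{r} \end{pmatrix}_{\Sigma_u}
 &+ \frac{1}{1-\gamma^2}\left\{(\gamma C - \gamma - A)\,\gamma(\mathbf{v}\cdot\mathbf{r} - l) - (\gamma A + 1 - C)\,l\right\}
    \begin{pmatrix} 1 \\ \mathbf{0} \end{pmatrix}_{\Sigma_u} \\
 &+ \frac{1}{1-\gamma^2}\left\{(\gamma D + 1)\,\gamma(\mathbf{v}\cdot\mathbf{r} - l) + (\gamma + D)\,l\right\}
    \begin{pmatrix} \gamma \\ \gamma\mathbf{v} \end{pmatrix}_{\Sigma_u}
\end{aligned}
\]

which leads to the following generic transformation equations:

\[ l' = \frac{\gamma}{1-\gamma^2}\left[\gamma^2 D + \gamma C - A\right](\mathbf{v}\cdot\mathbf{r}) + (\gamma D + C)\,l \tag{15.25} \]

\[ \mathbf{r}' = \mathbf{r} + \frac{\gamma^2(\gamma D + 1)}{1-\gamma^2}(\mathbf{v}\cdot\mathbf{r})\,\mathbf{v} + \gamma D\,l\,\mathbf{v}. \tag{15.26} \]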


Exercise 15.2.3. Prove that the vector expression of the Lorentz transformation we 
derived in equations (1.74), (1.75), (1.76), and (1.77) of Chap. 1 are recovered from 
the generic relations (15.25) and (15.26) for the various values of the coefficients 
A, B, C, D. 


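To specialize further and obtain the boosts we demand:

\[ \mathbf{v} = \begin{pmatrix} v \\ 0 \\ 0 \end{pmatrix}_{\Sigma_u}, \qquad
   \mathbf{r} = \begin{pmatrix} x \\ y \\ z \end{pmatrix}_{\Sigma_u}. \]

Then the (generic) transformation equations become:

\[ l' = \frac{\gamma}{1-\gamma^2}\left[\gamma^2 D + \gamma C - A\right]v\,x + (\gamma D + C)\,l, \qquad
   x' = -\gamma D\,x + \gamma v D\,l. \tag{15.27} \]


11 This is not necessary but it will help the reader to associate the new approach with the standard
formalism.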


Exercise 15.2.4 Compute the boost along the common x-axis for each type of 
Lorentz transformation. Show that the results coincide with those of Sect. 1.7 as 
well as with the results of Exercise 15.2.5 below. 


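Another useful representation of the covariant Lorentz transformation is in the
form of a matrix in a specific coordinate system. For example in the proper frame
$\Sigma_u$ of $u^a$ we consider the decomposition (15.24) of the four-vectors $u^a$, $v^a$ and
compute easily ($v^\mu = \mathbf{v}$):

\[
L_u(v)^i{}_j =
\begin{pmatrix}
C + \gamma D & \dfrac{\gamma}{1-\gamma^2}\,(\gamma C + \gamma^2 D - A)\,v_\mu \\[2mm]
\gamma D\,v^\mu & \delta^\mu_\nu + \dfrac{\gamma^2}{1-\gamma^2}\,(\gamma D + 1)\,v^\mu v_\nu
\end{pmatrix}.
\tag{15.28}
\]

In writing (15.28) we followed the convention that the upper indices count columns
and the lower indices rows, whereas the Greek indices take the values 1, 2, 3. For a
boost this matrix becomes:

\[
L_u(v)^i{}_j =
\begin{pmatrix}
C + \gamma D & \dfrac{\gamma v}{1-\gamma^2}\,(\gamma C + \gamma^2 D - A) & 0 \\[2mm]
\gamma D v & -\gamma D & 0 \\
0 & 0 & \delta^K_L
\end{pmatrix}
\tag{15.29}
\]

where the indices K, L take the values 1, 2.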
Exercise 15.2.5 Prove that the determinant of the generic boost (15.29) equals:

\[ \det L_u(v) = -AD. \tag{15.30} \]

Conclude that in the cases I, II (spatial and temporal inversion) $\det L_u(v) = -1$
whereas for the cases III, IV (proper Lorentz transformation and spacetime
inversion) $\det L_u(v) = +1$. This shows which types of Lorentz transformation
do not constitute (by themselves only!) a group (because they do not contain the
identity).¹²
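
The determinant formula (15.30) can be checked numerically for the four cases with B = 0. The sketch below assumes NumPy, an arbitrary sample speed v (c = 1), and the row/column layout in which the first row gives $l'$ and the second row gives $x'$ as in (15.27); `boost_matrix` and `cases_B0` are hypothetical names.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # Lorentz metric, signature (-,+,+,+)


def boost_matrix(A, C, D, v):
    """Generic boost (15.29) along x; rows give (l', x', y', z')."""
    g = 1.0 / np.sqrt(1.0 - v**2)
    k = g * v / (1.0 - g**2)
    return np.array([
        [C + g * D, k * (g * C + g**2 * D - A), 0.0, 0.0],
        [g * D * v, -g * D,                     0.0, 0.0],
        [0.0,       0.0,                        1.0, 0.0],
        [0.0,       0.0,                        0.0, 1.0],
    ])


def cases_B0(v):
    """Coefficients (A, C, D) of cases I-IV (all with B = 0)."""
    g = 1.0 / np.sqrt(1.0 - v**2)
    return {"I": (1, 0, 1), "II": (-1, 0, -1),
            "III": (1, 2 * g, -1), "IV": (-1, -2 * g, 1)}


if __name__ == "__main__":
    v = 0.6  # sample relative speed (assumed value)
    for name, (A, C, D) in cases_B0(v).items():
        M = boost_matrix(A, C, D, v)
        print(name, "det =", round(float(np.linalg.det(M)), 12), " -AD =", -A * D)
```

For case III the matrix reduces to the familiar boost with entries $\gamma$ and $-\gamma v$, which is a convenient cross-check of the signs in (15.29).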


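Exercise 15.2.6 Show that the matrix representation of each of the four types of
Lorentz transformation is the following:


Case I (Space Inversion) (C = 0, D = 1, A = 1)

General:

\[
L_u(v)^i{}_j =
\begin{pmatrix}
\gamma & -\gamma v_\mu \\
\gamma v^\mu & \delta^\mu_\nu + \dfrac{\gamma^2}{1-\gamma}\,v^\mu v_\nu
\end{pmatrix}
\tag{15.31}
\]

Boost:

\[
L_u(v)^i{}_j =
\begin{pmatrix}
\gamma & -\gamma v & 0 & 0 \\
\gamma v & -\gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\tag{15.32}
\]

Case II (Time Inversion) (C = 0, A = -1, D = -1)

General:

\[
L_u(v)^i{}_j =
\begin{pmatrix}
-\gamma & \gamma v_\mu \\
-\gamma v^\mu & \delta^\mu_\nu + \dfrac{\gamma^2}{1+\gamma}\,v^\mu v_\nu
\end{pmatrix}
\tag{15.33}
\]

Boost:

\[
L_u(v)^i{}_j =
\begin{pmatrix}
-\gamma & \gamma v & 0 & 0 \\
-\gamma v & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\tag{15.34}
\]

Case III (Proper Lorentz transformation) (C = 2γ, A = 1, D = -1)

General:

\[
L_u(v)^i{}_j =
\begin{pmatrix}
\gamma & -\gamma v_\mu \\
-\gamma v^\mu & \delta^\mu_\nu + \dfrac{\gamma^2}{1+\gamma}\,v^\mu v_\nu
\end{pmatrix}
\tag{15.35}
\]

Boost:

\[
L_u(v)^i{}_j =
\begin{pmatrix}
\gamma & -\gamma v & 0 & 0 \\
-\gamma v & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\tag{15.36}
\]

Case IV (Spacetime Inversion) (C = -2γ, A = -1, D = 1)

General:

\[
L_u(v)^i{}_j =
\begin{pmatrix}
-\gamma & \gamma v_\mu \\
\gamma v^\mu & \delta^\mu_\nu + \dfrac{\gamma^2}{1-\gamma}\,v^\mu v_\nu
\end{pmatrix}
\tag{15.37}
\]

Boost:

\[
L_u(v)^i{}_j =
\begin{pmatrix}
-\gamma & \gamma v & 0 & 0 \\
\gamma v & -\gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\tag{15.38}
\]


12 The set of all four types of Lorentz transformations constitutes a group. This group has four
components. Of those only the subset of the proper Lorentz transformations forms a group, which
is a subgroup of the Lorentz group.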

We note that all four types of Lorentz transformation can be written in covariant
form. This means that it is not necessary to study the relativistic problems with the
proper Lorentz transformation only and one can use equally well all other types of
Lorentz transformation. However this is not done in practice and very seldom in the
existing literature.¹³

In the following sections we shall consider various simple but important applica- 
tions of the covariant Lorentz transformation, keeping always in mind the level of 
the present book. 


15.2.4 The Invariant Length of a Four-Vector


Consider a four-vector $w^a$ and the Lorentz transformation L(u, v) defined by the
unit timelike (non-collinear) four-vectors $u^a$, $v^a$. Consider the decomposition:

\[ w^a = a_1 u^a + a_2 v^a + w_\perp^a \]

where $w_\perp^a = p(u, v)^a{}_b\, w^b$ is the normal projection of $w^a$ on the 2-plane of $u^a$, $v^a$.
We introduce the invariants $A_u = w^a u_a$, $A_v = w^a v_a$ (and $\gamma = -u^a v_a$) and
compute easily:

\[ A_u = -a_1 - a_2\gamma, \qquad A_v = -a_1\gamma - a_2. \]

Reversing these relations we compute $a_1$, $a_2$ in terms of $A_u$, $A_v$ and find that:

\[ w^a = \frac{1}{1-\gamma^2}\left\{(\gamma A_v - A_u)\,u^a + (\gamma A_u - A_v)\,v^a\right\} + w_\perp^a. \tag{15.39} \]

It follows that the length $w^2 = w^a w_a$ of $w^a$ is given by the relation:

\[ w^2 = \frac{1}{1-\gamma^2}\left(-A_u^2 - A_v^2 + 2\gamma A_v A_u\right) + w_\perp^a w_{\perp a}. \tag{15.40} \]

This result is important because $w^2$ is Lorentz invariant. This means that to any
four-vector we have associated a Lorentz invariant quantity written in covariant form. In
later sections we shall discuss the use of this invariant.


13 See for example the book “The Physics of the Time Reversal” by Robert Sachs (1987), The
University of Chicago Press.
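
The decomposition (15.39) and the length formula (15.40) can be illustrated with concrete numbers. The following is a minimal sketch, assuming NumPy and arbitrarily chosen sample vectors; `decompose` and `length_formula` are hypothetical helper names, and $w_\perp^a$ is obtained by subtracting the in-plane part rather than by building the projection tensor explicitly.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # Lorentz metric, signature (-,+,+,+)


def four_velocity(v3):
    """Unit timelike four-vector (gamma, gamma*v) for a 3-velocity v3 (c = 1)."""
    v3 = np.asarray(v3, dtype=float)
    g = 1.0 / np.sqrt(1.0 - v3 @ v3)
    return np.concatenate(([g], g * v3))


def decompose(w, u, v):
    """Return gamma, A_u, A_v and the normal part w_perp of (15.39)."""
    g = -(u @ ETA @ v)
    A_u, A_v = w @ ETA @ u, w @ ETA @ v
    a1 = (g * A_v - A_u) / (1.0 - g**2)
    a2 = (g * A_u - A_v) / (1.0 - g**2)
    w_perp = w - a1 * u - a2 * v
    return g, A_u, A_v, w_perp


def length_formula(w, u, v):
    """Right-hand side of (15.40)."""
    g, A_u, A_v, w_perp = decompose(w, u, v)
    in_plane = (-A_u**2 - A_v**2 + 2.0 * g * A_v * A_u) / (1.0 - g**2)
    return in_plane + w_perp @ ETA @ w_perp


if __name__ == "__main__":
    u = four_velocity([0.1, -0.2, 0.3])   # sample observers (assumed values)
    v = four_velocity([0.5, 0.2, -0.1])
    w = np.array([2.0, -1.0, 0.5, 3.0])   # arbitrary four-vector
    print(length_formula(w, u, v), w @ ETA @ w)
```

The two printed numbers agree up to rounding, and the computed $w_\perp^a$ is orthogonal to both $u^a$ and $v^a$, as the decomposition requires.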

For future reference we note that the action of the generic Lorentz transformation
on the (arbitrary) four-vector $w^a$ reads:

\[
w'^a = \frac{1}{1-\gamma^2}\left\{(\gamma A - C)A_u + (\gamma C - A)A_v\right\}u^a
     + \frac{D}{1-\gamma^2}\,(\gamma A_v - A_u)\,v^a + w_\perp^a.
\tag{15.41}
\]

15.3. The Four Types of the Lorentz Transformation Viewed 
as Spacetime Reflections 


We have already pointed out that the first two types of the Lorentz transformation
satisfy the property $L^2 = I$, that is, their action twice produces no change. Such
operations we have called spacetime reflections. In this section we take the subject
further and show that all four types of the Lorentz transformation can be described
in terms of spacetime reflections.


Let $n^a$ be a unit vector so that $\epsilon(n) = \pm 1$. Then the tensor:

\[ N_{\epsilon(n)}{}^a{}_b(n) = \delta^a_b - 2\epsilon(n)\,n^a n_b \tag{15.42} \]

reflects $n^a$, that is, $N_{\epsilon(n)}{}^a{}_b(n)\,n^b = -n^a$.

From the two vectors $u^a$, $v^a$ we can define two new vectors: the spacelike unit
vector $w^a = \frac{1}{\sqrt{2(\gamma-1)}}(u^a - v^a)$ and the timelike unit vector
$s^a = \frac{1}{\sqrt{2(\gamma+1)}}(u^a + v^a)$.

Then it is easy to see that the spacetime reflection along $w^a$, i.e. $N_+{}^a{}_b(w)$, is the
space inversion Lorentz transformation and that the spacetime reflection along
$s^a$, i.e. $N_-{}^a{}_b(s)$, is the time inversion Lorentz transformation. This explains the
spacetime reflection property $L^2 = I$ of these two types of transformation.

The proper Lorentz transformation cannot be described by means of a single
spacetime reflection operator. To see this let us denote this transformation as $L_3{}^a{}_b$
and assume that it can be written in the form:

\[ L_3{}^a{}_b = \delta^a_b + k\,m^a m_b \tag{15.43} \]

for some unit vector $m^a$ and some scalar factor k. Writing $m^a = \alpha u^a + \beta v^a + m_\perp^a$
and demanding that the resulting form of the transformation coincides with the one
given in Table 15.3 we arrive at a contradiction. However we can represent $L_3{}^a{}_b$ as
the product of two spacetime reflections. Indeed it is easy to show that $L_3{}^a{}_b$ can be
written:

\[ L_3{}^a{}_b = \left(\delta^a_c + 2u^a u_c\right)\left(\delta^c_b + \frac{1}{1+\gamma}\,(u^c + v^c)(u_b + v_b)\right) \tag{15.44} \]

that is¹⁴:

\[ L_3{}^a{}_b = N_-{}^a{}_c(u)\,N_-{}^c{}_b(s). \tag{15.45} \]

It is interesting to note that the product $N_-{}^a{}_c(v)\,N_-{}^c{}_b(s)$ produces the type VI Lorentz
transformation we have neglected, and corresponds to the inverse $(L_3^{-1})^a{}_b$.
Working similarly with the spacetime inversion $L_4{}^a{}_b$ we show that:

\[ L_4{}^a{}_b = \left(\delta^a_c + 2u^a u_c\right)\left(\delta^c_b + \frac{1}{1-\gamma}\,(u^c - v^c)(u_b - v_b)\right) \tag{15.46} \]

which implies,

\[ L_4{}^a{}_b = N_-{}^a{}_c(u)\,N_+{}^c{}_b(w). \tag{15.47} \]

Therefore we have described all four types of the Lorentz transformation in terms
of spacetime reflections along the reference vector $u^a$ and the characteristic vectors
$w^a$, $s^a$.
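
The factorizations (15.45) and (15.47) lend themselves to a direct numerical check. The sketch below assumes NumPy, arbitrary sample velocities, and the closed forms of Table 15.3 as reference; the helper names `reflection`, `characteristic_vectors` and `table_15_3` are illustrative only.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # Lorentz metric, signature (-,+,+,+)


def four_velocity(v3):
    """Unit timelike four-vector (gamma, gamma*v) for a 3-velocity v3 (c = 1)."""
    v3 = np.asarray(v3, dtype=float)
    g = 1.0 / np.sqrt(1.0 - v3 @ v3)
    return np.concatenate(([g], g * v3))


def reflection(n):
    """N^a_b = delta^a_b - 2 eps(n) n^a n_b of (15.42); n is assumed to be a unit vector."""
    n_low = ETA @ n
    eps = np.sign(n @ n_low)      # +1 for spacelike n, -1 for timelike n
    return np.eye(4) - 2.0 * eps * np.outer(n, n_low)


def characteristic_vectors(u, v):
    """The spacelike unit vector w and the timelike unit vector s built from u, v."""
    g = -(u @ ETA @ v)
    w = (u - v) / np.sqrt(2.0 * (g - 1.0))
    s = (u + v) / np.sqrt(2.0 * (g + 1.0))
    return w, s


def table_15_3(u, v):
    """Reference closed forms of Table 15.3 as matrices L^a_b."""
    g = -(u @ ETA @ v)
    ul, vl = ETA @ u, ETA @ v
    minus = np.outer(u - v, ul - vl) / (1.0 - g)
    plus = np.outer(u + v, ul + vl) / (1.0 + g)
    eye = np.eye(4)
    return {"I": eye + minus, "II": eye + plus,
            "III": eye + plus - 2.0 * np.outer(u, vl),
            "IV": eye + minus + 2.0 * np.outer(u, vl)}


if __name__ == "__main__":
    u = four_velocity([0.1, -0.2, 0.3])   # sample observers (assumed values)
    v = four_velocity([0.5, 0.2, -0.1])
    w, s = characteristic_vectors(u, v)
    T = table_15_3(u, v)
    print(np.allclose(reflection(u) @ reflection(s), T["III"]))  # (15.45)
    print(np.allclose(reflection(u) @ reflection(w), T["IV"]))   # (15.47)
```

Swapping the reference vector, `reflection(v) @ reflection(s)`, gives the inverse of the proper transformation for the same sample vectors.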

One application of this result (see Krause ibid.) is to compute the transformation
matrix S(L) corresponding to the Lorentz transformation L in the Dirac four-spinor
transformation law. We consider the Dirac 4 × 4 $\gamma$ matrices which are defined by
the condition:

\[ \gamma^a\gamma^b + \gamma^b\gamma^a = 2\eta^{ab}. \tag{15.48} \]

We introduce as usual the matrix $\gamma^5 = \gamma^0\gamma^1\gamma^2\gamma^3$ which satisfies the well known
properties:

\[ \gamma^5\gamma^a + \gamma^a\gamma^5 = 0, \qquad (\gamma^5)^2 = -1. \tag{15.49} \]

The invariance of Dirac’s equation under a Lorentz transformation $L^a{}_b$ implies the
condition:

\[ \gamma^a = L^a{}_b\, S(L)\,\gamma^b\, S^{-1}(L) \tag{15.50} \]

where S(L) is a non-singular matrix associated with the transformation $L^a{}_b$. Taking
$L^a{}_b$ to be a spacetime reflection $N_{\epsilon(n)}{}^a{}_b(n) = \delta^a_b - 2\epsilon(n)\,n^a n_b$ we compute the
commutator:

\[ [\gamma^a, S(L)] = -2\epsilon(n)\,n^a\, S(L)\,\slashed{n} \tag{15.51} \]

where $\slashed{n} = n_a\gamma^a$. Multiplying (15.48) with $n_b$ we get the anticommutator:

\[ \{\gamma^a, \slashed{n}\} = 2n^a. \tag{15.52} \]

To make (15.52) a commutator we multiply with $\gamma^5$ and use (15.49) to get:

\[ [\gamma^a, \gamma^5\slashed{n}] = -2\epsilon(n)\,\gamma^5 n^a \tag{15.53} \]

which by virtue of the identity $\slashed{n}\slashed{n} = \epsilon(n)$ is written:

\[ [\gamma^a, \gamma^5\slashed{n}] = -2n^a\,(\gamma^5\slashed{n})\,\slashed{n}. \tag{15.54} \]

Comparison of equations (15.51) and (15.54) gives:

\[ S(L) = \gamma^5\slashed{n} \tag{15.55} \]

up to a constant. Let us denote the space(time) reflection Lorentz transformation
along the vector $w^a$ (resp. $s^a$) by $L_1$ (resp. $L_2$). Then we have:

\[ S(L_1) = \gamma^5\slashed{w} = \frac{1}{\sqrt{2(\gamma-1)}}\,\gamma^5(\slashed{u} - \slashed{v}) \tag{15.56} \]

\[ S(L_2) = \gamma^5\slashed{s} = \frac{1}{\sqrt{2(\gamma+1)}}\,\gamma^5(\slashed{u} + \slashed{v}). \tag{15.57} \]


14 See Krause J (1977) “Lorentz transformations as space-time reflections” J. Math. Physics 18,
879-893.


Concerning the proper Lorentz transformation we have from (15.45):

\[
S(L_3) = S(N_-(u))\,S(N_-(s))
       = \gamma^5\slashed{u}\,\frac{1}{\sqrt{2(\gamma+1)}}\,\gamma^5(\slashed{u} + \slashed{v})
       = \frac{1}{\sqrt{2(\gamma+1)}}\,(-1 + \slashed{u}\slashed{v}).
\tag{15.58}
\]

Similarly for the spacetime inversion we use (15.47) to find:

\[
S(L_4) = S(N_-(u))\,S(N_+(w))
       = \gamma^5\slashed{u}\,\frac{1}{\sqrt{2(\gamma-1)}}\,\gamma^5(\slashed{u} - \slashed{v})
       = -\frac{1}{\sqrt{2(\gamma-1)}}\,(1 + \slashed{u}\slashed{v}).
\tag{15.59}
\]

These results are manifestly covariant.
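
The spinor matrices above can be tested against condition (15.50) in an explicit representation. The sketch below is only an illustration under stated assumptions: it uses NumPy, arbitrary sample velocities, and one particular choice of $\gamma$ matrices compatible with (15.48), namely i times the standard Dirac matrices; all function names are hypothetical.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # Lorentz metric, signature (-,+,+,+)


def four_velocity(v3):
    """Unit timelike four-vector (gamma, gamma*v) for a 3-velocity v3 (c = 1)."""
    v3 = np.asarray(v3, dtype=float)
    g = 1.0 / np.sqrt(1.0 - v3 @ v3)
    return np.concatenate(([g], g * v3))


def gamma_matrices():
    """Assumed representation: i times the standard Dirac matrices, so that (15.48) holds."""
    sig = [np.array([[0, 1], [1, 0]], dtype=complex),
           np.array([[0, -1j], [1j, 0]], dtype=complex),
           np.array([[1, 0], [0, -1]], dtype=complex)]
    Z, I2 = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)
    dirac = [np.block([[I2, Z], [Z, -I2]])] + [np.block([[Z, s], [-s, Z]]) for s in sig]
    return [1j * m for m in dirac]


def slash(n, G):
    """n-slash = n_a gamma^a."""
    n_low = ETA @ n
    return sum(n_low[a] * G[a] for a in range(4))


def reflection(n):
    """N^a_b of (15.42) for a unit vector n."""
    n_low = ETA @ n
    return np.eye(4) - 2.0 * np.sign(n @ n_low) * np.outer(n, n_low)


def S_reflection(n, G):
    """S(L) = gamma^5 n-slash of (15.55), with the constant factor set to 1."""
    return G[0] @ G[1] @ G[2] @ G[3] @ slash(n, G)


def satisfies_15_50(L, S, G):
    """Check gamma^a = L^a_b S gamma^b S^{-1} for a = 0..3."""
    S_inv = np.linalg.inv(S)
    return all(np.allclose(G[a], sum(L[a, b] * (S @ G[b] @ S_inv) for b in range(4)))
               for a in range(4))


if __name__ == "__main__":
    G = gamma_matrices()
    u = four_velocity([0.1, -0.2, 0.3])   # sample observers (assumed values)
    v = four_velocity([0.5, 0.2, -0.1])
    g = -(u @ ETA @ v)
    s = (u + v) / np.sqrt(2.0 * (g + 1.0))
    L3 = reflection(u) @ reflection(s)                 # (15.45)
    S3 = S_reflection(u, G) @ S_reflection(s, G)       # first form of (15.58)
    print(satisfies_15_50(L3, S3, G))
```

In this representation the product form of $S(L_3)$ also agrees numerically with the closed form $(-1 + \slashed{u}\slashed{v})/\sqrt{2(\gamma+1)}$ of (15.58); other representations of the $\gamma$ matrices would serve equally well for the check.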


15.4 Relativistic Composition Rule of 4-Vectors 


In this section we employ the covariant Lorentz transformation to discuss the
relativistic composition rule of four-vectors. Usually one refers to the relativistic
composition rule for 3-velocities and 3-accelerations but, as we shall show, the composition
rule is general and applies to all four-vectors and of course to all types of Lorentz
transformation.

The reason that we pay so much attention to the composition rule of 3-velocities
is historic and is due to the fact that this rule was used to prove that the velocity
of light was incompatible with the Newtonian composition rule of 3-vectors and
therefore a new theory of Physics had to be introduced (of course no one is so unwise
as to say that Newtonian Physics had to be abandoned!). Furthermore it was shown
that the behavior of the velocity of light was compatible with the composition rule
proposed by Special Relativity, a fact that contributed to the acceptance and the
further development of that theory.

Before we discuss the relativistic composition rule for four-vectors we examine
the corresponding rule of Newtonian Physics.¹⁵ Let us start with the Galileo
transformation for the position vector¹⁶:

\[ \mathbf{r}'_{P} = \Gamma_{\mathbf{v}}(O, O')\,\mathbf{r}_{P} \tag{15.61} \]


15 See also Sect. 6.3.


Fig. 15.1 Passive and active interpretation of a transformation: (a) Passive view, (b) Active view


As we know there are two ways to look at a transformation: The passive and the 
active view. According to the first view we consider one vector (more generally 
tensor) and two coordinate systems and the transformation transfers the components 
of the vector (respectively tensor) from one system to the other leaving the vector 
(respectively tensor) the same. In the second point of view we consider one 
coordinate system and two vectors (respectively tensors) and the transformation 
connects one vector (respectively tensor) with another in the same coordinate system 
(see Fig. 15.1). 
16 Many times this transformation is written 

$$\mathbf{r}' = A\mathbf{r} - \mathbf{v}t \tag{15.60}$$

where $A$ is a Euclidean (orthogonal) rotation matrix and $\mathbf{v}$ is the velocity of $\Sigma'$ wrt $\Sigma$. This relation 
is not more general than (15.61), and the matrix $A$ is not needed because relation (15.60) is a vector 
equation. The matrix $A$ is needed only when (15.61) is written in a coordinate system in which case 
it describes the relative rotation of the 3-axes. 

For example the passive view of the Galileo transformation is: 

$$x' = x - v_x t, \qquad y' = y - v_y t, \qquad z' = z - v_z t \tag{15.62}$$

and the active view is equation (15.61). The active view contains more of the 
mathematical information of the transformation whereas the passive view is necessary in 
the computations. Concerning the Lorentz transformation, in a similar manner the 
passive view of a boost along the common x-axis is (with c = 1): 

$$t' = \gamma(t - vx), \qquad x' = \gamma(x - vt), \qquad y' = y, \qquad z' = z \tag{15.63}$$


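and the active view for the general Lorentz transformation is (see (15.22)): 

$$x'^i = x^i + \frac{1}{1-\gamma^2}\left\{\left[(\gamma C - \gamma - A)v_j + (\gamma A + 1 - C)u_j\right]x^j\right\}u^i + \frac{1}{1-\gamma^2}\left\{\left[(\gamma D + 1)v_j - (\gamma + D)u_j\right]x^j\right\}v^i. \tag{15.64}$$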

After this detour into the two views of a transformation let us return to the 
Euclidean 3-space where we describe Newtonian motion and let us consider a 
moving point mass. Let $O$, $O'$ be the origins of the coordinate systems of two 
Newtonian observers $\Sigma$ and $\Sigma'$ and let $\mathbf{v}_{\Sigma}$ and $\mathbf{v}_{\Sigma'}$ be the velocities of a point mass 
wrt $\Sigma$ and $\Sigma'$ respectively. The Newtonian law of composition of velocities requires: 

$$\mathbf{v}_{\Sigma'} = \mathbf{v}_{\Sigma} - \mathbf{v}_{O,O'} \tag{15.65}$$

where $\mathbf{v}_{O,O'}$ is the relative velocity of the observer $\Sigma'$ wrt $\Sigma$. We can regard 
equation (15.65) as a transformation in a linear three dimensional space whose vectors 
are the velocities (this is the tangent space of the Newtonian three dimensional 
space). In this space the transformation (15.65) is the Galileo transformation! That 
is we have: 

$$\mathbf{v}_{\Sigma'} = T_{\mathbf{u}}(O, O')\,\mathbf{v}_{\Sigma} \tag{15.66}$$

where $\mathbf{u}$ is the relative velocity of $\Sigma'$ wrt $\Sigma$. 
A similar result holds for the acceleration, that is Newton's composition rule for 
acceleration is: 

$$\mathbf{a}_{\Sigma'} = T_{\mathbf{u}}(O, O')\,\mathbf{a}_{\Sigma} \tag{15.67}$$


and similarly for any other vector. This is expected because according to the Galileo 
Principle of Covariance, the Galileo transformation concerns all Newtonian vectors 
(and tensors) and not only the position vector. 

From the above analysis we conclude that: 


The Newtonian composition rule of Euclidean vectors (respectively tensors) is equivalent 
to the active view of the Galileo transformation in the corresponding linear space of the 
relevant vector (velocity space, momentum space, acceleration space etc.)(respectively 
tensor). 


[Fig. 15.2 Composition of Lorentz transformations] 

Based on this conclusion we define the Law of Composition of four-vectors in 
Special Relativity as follows: 

Definition 15.4.1 Consider the LCF $\Sigma$ and $\Sigma'$ with four-velocities $u^i$ and $v^i$ 
respectively ($u^iu_i = v^iv_i = -1$, $u^i \neq \pm v^i$) and let $L_{(u,v)}$ be the Lorentz 
transformation defined by the four-vectors $u^i$, $v^i$. Let $w^i_{P'}$ be a four-vector in the 
tangent space of $\Sigma'$ at the point $P'$, with position vector $r^i_{P'}$ on the straight line 
(= the cosmic line of $\Sigma'$) defined by the four-vector $v^i$ (see Fig. 15.2). We define the 
point $P$ on the straight line (= the cosmic line of $\Sigma$) defined by the four-vector $u^i$, 
by the position vector: 

$$r^i_{P} = \left(L^{-1}_{(u,v)}\right)^i{}_j\, r^j_{P'}. \tag{15.68}$$

The tangent vector $w^i_P$ in the tangent space $T_PM$ which is defined by the relation: 

$$w^i_{P} = \left(L^{-1}_{(u,v)}\right)^i{}_j\, w^j_{P'} \tag{15.69}$$

we call the composite vector under the Lorentz transformation $L_{(u,v)}$ and 
postulate that it defines the Composition Rule of four-vectors in spacetime. 

In the following when the four-vectors $u^i$, $v^i$ are understood we shall omit them, 
that is, instead of writing $L^{-1}_{(u,v)}$ we shall simply write $L^{-1}$. 


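15.4.1 Computation of the Composite Four-Vector 


Using relation (15.23), which gives the inverse Lorentz transformation, we compute 
the generic form of the composite four-vector $w^i_P$: 

$$\begin{aligned}
w^i_P &= w^i_{P'} + \frac{1}{1-\gamma^2}\left\{\left[(\gamma C - \gamma - A)u_j + (\gamma A + 1 - C)v_j\right]w^j_{P'}\right\}v^i + \frac{1}{1-\gamma^2}\left\{\left[(\gamma D + 1)u_j - (\gamma + D)v_j\right]w^j_{P'}\right\}u^i \\
&= w^i_{P'} + \frac{1}{1-\gamma^2}\left\{(\gamma C - \gamma - A)A'_u + (\gamma A + 1 - C)A'_v\right\}v^i + \frac{1}{1-\gamma^2}\left\{(\gamma D + 1)A'_u - (\gamma + D)A'_v\right\}u^i
\end{aligned} \tag{15.70}$$

where we have introduced the quantities: 

$$A'_u = w^i_{P'}u_i, \qquad A'_v = w^i_{P'}v_i. \tag{15.71}$$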

Special attention must be paid to the computation of the quantity $A'_v = w^i_{P'}v_i$. 
Indeed this quantity involves the four-vectors $w^i_{P'}$, $v_i$ which are defined at different 
points of $M^4$ therefore their contraction is meaningless. But $M^4$ is a flat space 
therefore we can parallel transport the four-vector $w^i_{P'}$ from the point $P'$ to the point 
$P$ along any path we wish, the parallel transport being independent of the path taken.¹⁷ 
This parallel transport is not possible in a curved space in which transportation 
is path dependent. This is the reason we do not use the composition of four-vectors 
in General Relativity. This does not mean of course that people have not tried to do 
so,¹⁸ however, as expected, without any success. 

We continue with the computation of the zeroth component ($A_u = w^i_Pu_i$) and 
the spatial part ($h^i_j(u)w^j_P$) of the composite four-vector in the proper frame of $u^i$. A 
simple computation gives for the first: 

$$A_u = \frac{1}{1-\gamma^2}\left[(\gamma A - \gamma^2 C - D\gamma)A'_u + (D - \gamma^2 A + \gamma C)A'_v\right]. \tag{15.72}$$

17 The parallel transport is defined by the requirement that the transported vector at the point $P$ has the 
same components as the original vector at the point $P'$ in the same (global) coordinate system 
of $M^4$. 

18 For example see F. Felice "On the velocity composition law in General Relativity" Lettere al 
Nuovo Cimento (1979) 25, 531-532. 


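Concerning the spatial part we have: 

$$h^i_j(u)w^j_P = h^i_j(u)w^j_{P'} + \frac{1}{1-\gamma^2}\left[(\gamma C - \gamma - A)A'_u + (\gamma A + 1 - C)A'_v\right]h^i_j(u)v^j. \tag{15.73}$$

Replacing $h^i_j(u) = \delta^i_j + u^iu_j$ we find: 

$$\begin{aligned}
h^i_j(u)w^j_P &= w^i_{P'} + A'_u u^i + \frac{1}{1-\gamma^2}\left\{(\gamma C - \gamma - A)A'_u + (\gamma A + 1 - C)A'_v\right\}(v^i - \gamma u^i) \\
&= w^i_{P'} + \frac{1}{1-\gamma^2}\left\{(1 + \gamma A - \gamma^2 C)A'_u - \gamma(\gamma A + 1 - C)A'_v\right\}u^i \\
&\quad + \frac{1}{1-\gamma^2}\left\{(\gamma C - \gamma - A)A'_u + (\gamma A + 1 - C)A'_v\right\}v^i.
\end{aligned} \tag{15.74}$$

The length of the spatial part is $h_{ij}(u)w^i_Pw^j_P$ and it is computed to be: 

$$\begin{aligned}
h_{ij}(u)w^i_Pw^j_P &= w_{Pi}w^i_P + (w^i_Pu_i)^2 \\
&= w_{P'i}w^i_{P'} + \frac{1}{(1-\gamma^2)^2}\left[(\gamma^2 A - \gamma C - D)A'_v + (\gamma D + \gamma^2 C - \gamma A)A'_u\right]^2
\end{aligned} \tag{15.75}$$

where we have replaced the quantity $w^i_Pu_i$ from (15.72) and used the fact that the 
Lorentz transformation is an isometry therefore $w_{Pi}w^i_P = w_{P'i}w^i_{P'}$. We note that 
the rhs contains only $w^i_{P'}$ and not $w^i_P$. 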

Relations (15.72) and (15.75) are general and hold for an arbitrary four-vector 
(null, timelike or spacelike) and all types of Lorentz transformation. Therefore they 
contain all possible rules of composition of all four-vectors in Special Relativity. 


15.4.2 The Relativistic Composition Rule for 3-Velocities 


In order to convince the reader that the general relations derived in the last section 
contain all known results as special cases we derive in the following the well known 
relativistic rules for the composition of 3-velocities and 3-accelerations for the 
proper Lorentz transformation. 

From Table 15.1 we have that the proper Lorentz transformation is defined by 
the values $A = 1$, $C = 2\gamma$, $D = -1$. Therefore in the case of the proper Lorentz 
transformation relations (15.72), (15.73), and (15.75) become: 

$$A_u = 2\gamma A'_u - A'_v, \tag{15.76}$$

$$h^i_j(u)w^j_P = h^i_j(u)w^j_{P'} + \frac{1}{1+\gamma}\left\{-(2\gamma + 1)A'_u + A'_v\right\}h^i_j(u)v^j \tag{15.77}$$

and 

$$h_{ij}(u)w^i_Pw^j_P = w_{P'i}w^i_{P'} + \left[2\gamma A'_u - A'_v\right]^2. \tag{15.78}$$
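The composite four-vector can be checked numerically. The following is a minimal sketch, assuming the signature (-, +, +, +), c = 1, and the reading of (15.70) in which the inverse transformation acts on the four-vector at P'; the function names `dot`, `four_velocity` and `composite` are illustrative and do not come from the text. It applies (15.70) with the proper values A = 1, C = 2γ, D = -1 and compares the result with (15.76).

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # assumed metric diag(-1, 1, 1, 1), c = 1


def dot(a, b):
    """Minkowski inner product a^i b_i."""
    return sum(g * x * y for g, x, y in zip(ETA, a, b))


def four_velocity(v3):
    """Four-velocity gamma * (1, v) of a 3-velocity with |v| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - sum(c * c for c in v3))
    return [gamma] + [gamma * c for c in v3]


def composite(w_prime, u, v, A, C, D):
    """Composite four-vector of (15.70) for the constants A, C, D."""
    gamma = -dot(u, v)
    Au = dot(w_prime, u)  # A'_u of (15.71)
    Av = dot(w_prime, v)  # A'_v of (15.71)
    k = 1.0 / (1.0 - gamma * gamma)
    cv = k * ((gamma * C - gamma - A) * Au + (gamma * A + 1.0 - C) * Av)
    cu = k * ((gamma * D + 1.0) * Au - (gamma + D) * Av)
    return [wp + cv * vi + cu * ui for wp, vi, ui in zip(w_prime, v, u)]


u = [1.0, 0.0, 0.0, 0.0]                   # proper frame of u
v = four_velocity([0.6, 0.0, 0.0])
w_prime = four_velocity([0.1, 0.3, -0.2])
gamma = -dot(u, v)

w = composite(w_prime, u, v, A=1.0, C=2.0 * gamma, D=-1.0)
lhs = dot(w, u)                                            # A_u
rhs = 2.0 * gamma * dot(w_prime, u) - dot(w_prime, v)      # rhs of (15.76)
```

In this sketch `lhs` and `rhs` agree to rounding error, and `dot(w, w)` stays equal to -1, as expected of an isometry.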


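Suppose that the components of the four-vectors $w^i_P$, $w^i_{P'}$ in the proper frame $\Sigma_u$ 
are: 

$$w^i_P = \begin{pmatrix}\gamma_w \\ \gamma_w\mathbf{w}\end{pmatrix}_{\Sigma_u}, \qquad w^i_{P'} = \begin{pmatrix}\gamma_{w'} \\ \gamma_{w'}\mathbf{w}'\end{pmatrix}_{\Sigma_u} \tag{15.79}$$

and those of the four-vectors $u^i$, $v^i$ which define the transformation: 

$$u^i = \begin{pmatrix}1 \\ \mathbf{0}\end{pmatrix}_{\Sigma_u}, \qquad v^i = \begin{pmatrix}\gamma_v \\ \gamma_v\mathbf{v}\end{pmatrix}_{\Sigma_u}.$$

We compute: 

$$A_u = -\gamma_w, \qquad A'_u = -\gamma_{w'}, \qquad A'_v = -\gamma_v\gamma_{w'} + \gamma_v\gamma_{w'}\,\mathbf{v}\cdot\mathbf{w}'$$

where for emphasis we have replaced $\gamma$ with $\gamma_v$. 
Replacing in (15.76) we find for the zeroth coordinate the well known 
transformation rule for $\gamma$'s: 

$$\gamma_w = \gamma_v\gamma_{w'}(1 + \mathbf{v}\cdot\mathbf{w}'). \tag{15.80}$$

Concerning the spatial coordinate we find from equation (15.77): 

$$\gamma_w\mathbf{w} = \gamma_{w'}\mathbf{w}' + \frac{1}{1+\gamma_v}\left\{-(2\gamma_v + 1)(-\gamma_{w'}) - \gamma_v\gamma_{w'} + \gamma_v\gamma_{w'}\,\mathbf{v}\cdot\mathbf{w}'\right\}\gamma_v\mathbf{v}. \tag{15.81}$$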

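Making use of (15.80)¹⁹ follows the well known formula (cf. (6.37)): 

$$\mathbf{w} = \frac{1}{\gamma_v(1 + \mathbf{v}\cdot\mathbf{w}')}\left[\mathbf{w}' + \gamma_v\left(1 + \frac{\gamma_v}{1+\gamma_v}\,\mathbf{v}\cdot\mathbf{w}'\right)\mathbf{v}\right]. \tag{15.82}$$

Exercise 15.4.1 Consider the boost along the x-axis of $\Sigma_u$ with velocity $v$ and 
assume that the decomposition of the 3-velocity $\mathbf{w}$ in $\Sigma_u$ is $\mathbf{w} = (w_x, w_y, w_z)^t_{\Sigma_u}$. Then 
prove that the relativistic composition rule for the 3-velocities is given by the well 
known relations: 

$$w_x = \frac{w'_x + v}{1 + vw'_x}, \qquad w_y = \frac{w'_y}{\gamma_v(1 + vw'_x)}, \qquad w_z = \frac{w'_z}{\gamma_v(1 + vw'_x)}. \tag{15.83}$$

Furthermore show that: 

$$w'_x = \frac{w_x - v}{1 - vw_x}, \qquad w'_y = \frac{w_y}{\gamma_v(1 - vw_x)}, \qquad w'_z = \frac{w_z}{\gamma_v(1 - vw_x)} \tag{15.84}$$

and verify the correspondence $\mathbf{v} \leftrightarrow -\mathbf{v}$, $\mathbf{w}' \leftrightarrow \mathbf{w}$. 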
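The composition rule for 3-velocities lends itself to a direct numerical check. The sketch below, with c = 1 and with the hypothetical helper names `gamma`, `compose` and `compose_x`, implements the vector formula (15.82) and the component form (15.83), so that the two can be compared for a boost along the x-axis together with the rule (15.80) for the γ factors.

```python
import math


def gamma(v3):
    """Lorentz factor of a 3-velocity (c = 1)."""
    return 1.0 / math.sqrt(1.0 - sum(c * c for c in v3))


def compose(v, w_prime):
    """3-velocity w of (15.82) for boost velocity v and velocity w_prime."""
    g = gamma(v)
    vw = sum(a * b for a, b in zip(v, w_prime))
    q = 1.0 + vw
    k = g * (1.0 + g / (1.0 + g) * vw)
    return [(wp + k * vi) / (g * q) for wp, vi in zip(w_prime, v)]


def compose_x(vx, w_prime):
    """Component form (15.83) for a boost along the x-axis."""
    g = gamma([vx, 0.0, 0.0])
    q = 1.0 + vx * w_prime[0]
    return [(w_prime[0] + vx) / q, w_prime[1] / (g * q), w_prime[2] / (g * q)]


v = [0.6, 0.0, 0.0]
w_prime = [0.1, 0.3, -0.2]

w_vector = compose(v, w_prime)            # vector formula (15.82)
w_components = compose_x(0.6, w_prime)    # component formulas (15.83)
w_back = compose([-c for c in v], w_vector)  # correspondence v <-> -v, w' <-> w
```

Here `w_vector` and `w_components` coincide, `gamma(w_vector)` equals `gamma(v) * gamma(w_prime) * (1 + v·w')`, and `w_back` returns the original `w_prime`, which illustrates (15.84).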


15.4.3 Riemannian Geometry and Special Relativity 


It is generally believed that Riemannian Geometry has no place in Special Relativity 
because Minkowski space $M^4$ is flat (= has zero curvature). This is true but it is not 
the whole story. Indeed as we have seen in Special Relativity besides the spacetime 
there are involved other linear spaces with physical significance such as the 3- 
velocity space, the 3-momentum space etc. Using the length of the spatial part of 
the four-vectors one can define in any of these spaces a Lorentz covariant, positive 
definite and symmetric (that is Riemannian) metric whose curvature does not (in 
general) vanish. 


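19 An equivalent expression is (see Ar Ben-Mehanem, Am. J. Phys. (1985) 53, p. 62-66): 

$$\mathbf{w} = \frac{1}{1 + \mathbf{v}\cdot\mathbf{w}'}\left\{\frac{1}{\gamma_v}\mathbf{w}' + \left[1 + \left(1 - \gamma_v^{-1}\right)\frac{\mathbf{v}\cdot\mathbf{w}'}{v^2}\right]\mathbf{v}\right\}.$$

The proof is simple. We have: 

$$1 + \frac{\gamma_v}{1+\gamma_v}(\mathbf{v}\cdot\mathbf{w}') = 1 + \frac{\gamma_v(\gamma_v - 1)}{\gamma_v^2 - 1}(\mathbf{v}\cdot\mathbf{w}') = 1 + \frac{\gamma_v - 1}{\gamma_v v^2}(\mathbf{v}\cdot\mathbf{w}') = 1 + \left(1 - \gamma_v^{-1}\right)(\mathbf{v}\cdot\mathbf{w}')\,v^{-2}.$$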


In the current section we study the case of the 3-velocity space and obtain results 
which have been around for a long time. One can use the same approach to study 
the 3-space of other four-vectors and obtain new results. From (15.78) we have for 
the length of the spatial part of the composite four-velocity²⁰: 

$$\begin{aligned}
\gamma_w^2\mathbf{w}^2 &= -1 + \left[(2\gamma_v)(-\gamma_{w'}) + \gamma_v\gamma_{w'} - \gamma_v\gamma_{w'}\,\mathbf{v}\cdot\mathbf{w}'\right]^2 \\
&= -1 + \left[\gamma_v\gamma_{w'}(1 + \mathbf{v}\cdot\mathbf{w}')\right]^2 \\
&= -1 + \gamma_w^2
\end{aligned}$$

where for emphasis we write $\gamma_v$ in place of $\gamma$ and we have used (15.80). This implies 
the relation: 

$$\mathbf{w}^2 = 1 - \frac{1}{\gamma_w^2}. \tag{15.85}$$


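Exercise 15.4.2 

(a) Prove that $\mathbf{w}^2$ can be written as follows: 

$$\mathbf{w}^2 = 1 - \frac{1}{\gamma_v^2\gamma_{w'}^2Q_{w'}^2} \tag{15.86}$$

where $Q_{w'} = 1 + \mathbf{v}\cdot\mathbf{w}'$. 
(b) Using (a) justify the following calculation: 

$$\begin{aligned}
\mathbf{w}^2 &= \frac{1}{Q_{w'}^2}\left[(1 + \mathbf{v}\cdot\mathbf{w}')^2 - (1 - v^2)(1 - w'^2)\right] \\
&= \frac{1}{Q_{w'}^2}\left[(v^2 + w'^2 + 2\mathbf{v}\cdot\mathbf{w}') + (\mathbf{v}\cdot\mathbf{w}')^2 - v^2w'^2\right] \\
&= \frac{1}{Q_{w'}^2}\left[(\mathbf{v} + \mathbf{w}')^2 - (\mathbf{v}\times\mathbf{w}')^2\right].
\end{aligned} \tag{15.87}$$

Hint: Use the identity $|\mathbf{a}\times\mathbf{b}|^2 = a^2b^2 - (\mathbf{a}\cdot\mathbf{b})^2$. 
(c) Using the correspondence $\mathbf{w}' \leftrightarrow \mathbf{w}$, $\mathbf{v} \leftrightarrow -\mathbf{v}$ show that: 

$$(\mathbf{w}')^2 = 1 - \frac{1}{\gamma_{w'}^2} \tag{15.88}$$

hence: 

$$(\mathbf{w}')^2 = 1 - \frac{1}{\gamma_v^2\gamma_w^2Q_w^2} \tag{15.89}$$

where $Q_w = 1 - \mathbf{v}\cdot\mathbf{w}$. Finally show that: 

$$\begin{aligned}
(\mathbf{w}')^2 &= \frac{1}{Q_w^2}\left[(1 - \mathbf{v}\cdot\mathbf{w})^2 - (1 - v^2)(1 - w^2)\right] \\
&= \frac{1}{Q_w^2}\left[(v^2 + w^2 - 2\mathbf{v}\cdot\mathbf{w}) + (\mathbf{v}\cdot\mathbf{w})^2 - v^2w^2\right] \\
&= \frac{1}{Q_w^2}\left[(\mathbf{v} - \mathbf{w})^2 - (\mathbf{v}\times\mathbf{w})^2\right].
\end{aligned} \tag{15.90}$$

20 Note that $w^i$ is a four-velocity hence $w^iw_i = -1$. 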


The quantity $(\mathbf{w}')^2$ is positive definite and, most important, it is Lorentz invariant 
(because the quantity $\gamma_{w'} = -w'^au_a$ is Lorentz invariant). Therefore it can be used 
as a Lorentz invariant, positive definite distance in the space of 3-velocities. This 
distance leads to a Lorentz covariant Riemannian metric which is not flat. Let us 
find this metric. 
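The invariance claimed here can be illustrated numerically. The sketch below assumes c = 1 and uses hypothetical helper names (`gamma`, `compose`, `relative_speed_sq`); it evaluates the right-hand side of (15.90) for two 3-velocities, then boosts both with the same arbitrary velocity `b` using the composition rule (15.82) and evaluates it again.

```python
import math


def gamma(v3):
    """Lorentz factor of a 3-velocity (c = 1)."""
    return 1.0 / math.sqrt(1.0 - sum(c * c for c in v3))


def compose(v, w_prime):
    """Relativistic composition (15.82) of boost velocity v with w_prime."""
    g = gamma(v)
    vw = sum(a * b for a, b in zip(v, w_prime))
    k = g * (1.0 + g / (1.0 + g) * vw)
    return [(wp + k * vi) / (g * (1.0 + vw)) for wp, vi in zip(w_prime, v)]


def relative_speed_sq(v, w):
    """Right-hand side of (15.90): [(v - w)^2 - (v x w)^2] / (1 - v.w)^2."""
    vw = sum(a * b for a, b in zip(v, w))
    diff_sq = sum((a - b) ** 2 for a, b in zip(v, w))
    cross = [v[1] * w[2] - v[2] * w[1],
             v[2] * w[0] - v[0] * w[2],
             v[0] * w[1] - v[1] * w[0]]
    cross_sq = sum(c * c for c in cross)
    return (diff_sq - cross_sq) / (1.0 - vw) ** 2


v = [0.6, 0.0, 0.0]
w = [0.2, 0.5, -0.3]
b = [-0.3, 0.4, 0.5]   # arbitrary boost applied to both velocities

d_before = relative_speed_sq(v, w)
d_after = relative_speed_sq(compose(b, v), compose(b, w))
w_rel = compose([-c for c in v], w)           # w seen from the frame moving with v
d_direct = sum(c * c for c in w_rel)
```

In this sketch `d_before`, `d_after` and `d_direct` agree to rounding error, and the expression is symmetric under the exchange of `v` and `w`.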

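In the space of 3-velocities we consider an "infinitesimal" change $\mathbf{w} = \mathbf{v} + d\mathbf{v}$ 
and have²¹: 

$$(\mathbf{v} - \mathbf{w})^2 = (d\mathbf{v})^2$$

$$(\mathbf{v}\times\mathbf{w})^2 = |\mathbf{v}\times(\mathbf{v} + d\mathbf{v})|^2 = |\mathbf{v}\times d\mathbf{v}|^2 = v^2(d\mathbf{v})^2 - (\mathbf{v}\cdot d\mathbf{v})^2$$

$$Q_w = 1 - \mathbf{v}\cdot\mathbf{w} = 1 - \mathbf{v}\cdot(\mathbf{v} + d\mathbf{v}) = 1 - v^2 - \mathbf{v}\cdot d\mathbf{v}.$$

Replacing the result in (15.90) we find: 

$$\begin{aligned}
(\mathbf{w}')^2 &= \frac{1}{(1 - v^2 - \mathbf{v}\cdot d\mathbf{v})^2}\left[(d\mathbf{v})^2 - v^2(d\mathbf{v})^2 + (\mathbf{v}\cdot d\mathbf{v})^2\right] \\
&= \frac{1}{(1 - v^2 - \mathbf{v}\cdot d\mathbf{v})^2}\left[(1 - v^2)(d\mathbf{v})^2 + (\mathbf{v}\cdot d\mathbf{v})^2\right].
\end{aligned} \tag{15.91}$$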


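21 Note that: $\mathbf{v}\cdot d\mathbf{v} = v\,dv$. 

The term: 

$$\frac{1}{(1 - v^2 - \mathbf{v}\cdot d\mathbf{v})^2} = \frac{1}{(1 - v^2)^2}\,\frac{1}{\left(1 - \dfrac{\mathbf{v}\cdot d\mathbf{v}}{1 - v^2}\right)^2} = \frac{1}{(1 - v^2)^2}\left[1 + \frac{2(\mathbf{v}\cdot d\mathbf{v})}{1 - v^2} + \frac{3(\mathbf{v}\cdot d\mathbf{v})^2}{(1 - v^2)^2} + \cdots\right].$$

Replacing in the first term of (15.91) we find: 

$$(\mathbf{w}')^2 = \frac{(1 - v^2)(d\mathbf{v})^2 + (\mathbf{v}\cdot d\mathbf{v})^2}{(1 - v^2)^2} + O((dv)^3).$$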


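In order to compute the second term in (15.91) we introduce in the space 
of 3-velocities spherical coordinates $(v, \theta, \phi)$ with the standard relation 
$\mathbf{v} = v(\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta)$. In these coordinates: 

$$(d\mathbf{v})^2 = (dv)^2 + v^2d\theta^2 + v^2\sin^2\theta\,d\phi^2$$

hence: 

$$-v^2(d\mathbf{v})^2 + (\mathbf{v}\cdot d\mathbf{v})^2 = -v^4(d\theta^2 + \sin^2\theta\,d\phi^2).$$

Replacing the results in equation (15.91) we find: 

$$\begin{aligned}
(\mathbf{w}')^2 &= \frac{1}{(1 - v^2)^2}\left[dv^2 + v^2(d\theta^2 + \sin^2\theta\,d\phi^2) - v^4(d\theta^2 + \sin^2\theta\,d\phi^2)\right] + O((dv)^3) \\
&= \frac{dv^2}{(1 - v^2)^2} + \frac{v^2 - v^4}{(1 - v^2)^2}(d\theta^2 + \sin^2\theta\,d\phi^2) + O((dv)^3) \\
&= \frac{dv^2}{(1 - v^2)^2} + \frac{v^2}{1 - v^2}(d\theta^2 + \sin^2\theta\,d\phi^2) + O((dv)^3).
\end{aligned} \tag{15.92}$$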


The quantity $(\mathbf{w}')^2$ is the required distance in the space of 3-velocities if we neglect 
third order terms in $dv$. Then the rhs defines a Riemannian metric $ds^2$ in that space 
as follows: 

$$ds^2 = \frac{dv^2}{(1 - v^2)^2} + \frac{v^2}{1 - v^2}(d\theta^2 + \sin^2\theta\,d\phi^2) \tag{15.93}$$

or, in more standard notation: 

$$g_{ij} = \mathrm{diag}\left(\frac{1}{(1 - v^2)^2},\ \frac{v^2}{1 - v^2},\ \frac{v^2}{1 - v^2}\sin^2\theta\right). \tag{15.94}$$

The contravariant metric is: 

$$g^{ij} = \mathrm{diag}\left((1 - v^2)^2,\ \frac{1 - v^2}{v^2},\ \frac{1 - v^2}{v^2\sin^2\theta}\right). \tag{15.95}$$

It can be shown that the space of 3-velocities endowed with this metric becomes 
a Riemannian space of constant negative curvature. If we introduce the rapidity $\chi$ 
with the relation $\gamma = \cosh\chi$ we have: 

$$1 - v^2 = \frac{1}{\cosh^2\chi}, \qquad \frac{v^2}{1 - v^2} = \sinh^2\chi, \qquad dv = \frac{d\chi}{\cosh^2\chi}$$

from which follows: 

$$ds^2 = d\chi^2 + \sinh^2\chi\,(d\theta^2 + \sin^2\theta\,d\phi^2). \tag{15.96}$$

For small $\chi$ we have $\sinh\chi \approx \chi \approx v$ and the space of 3-velocities is flat, as expected at 
the Newtonian level of small velocities.²² 
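The passage from the finite distance (15.90) to the metric (15.93) can be probed numerically. The sketch below assumes c = 1; the names `relative_speed_sq` and `metric_ds2` are illustrative, and the Cartesian form used in `metric_ds2` is the second line of (15.91) with the small term in the denominator dropped, which is equivalent to (15.93). It also checks, for collinear velocities, that the finite distance corresponds to a difference of rapidities.

```python
import math


def relative_speed_sq(v, w):
    """Right-hand side of (15.90) for 3-velocities v and w (c = 1)."""
    vw = sum(a * b for a, b in zip(v, w))
    diff_sq = sum((a - b) ** 2 for a, b in zip(v, w))
    cross = [v[1] * w[2] - v[2] * w[1],
             v[2] * w[0] - v[0] * w[2],
             v[0] * w[1] - v[1] * w[0]]
    return (diff_sq - sum(c * c for c in cross)) / (1.0 - vw) ** 2


def metric_ds2(v, dv):
    """Cartesian form of (15.93): [(1 - v^2) dv.dv + (v.dv)^2] / (1 - v^2)^2."""
    v_sq = sum(c * c for c in v)
    dv_sq = sum(c * c for c in dv)
    v_dv = sum(a * b for a, b in zip(v, dv))
    return ((1.0 - v_sq) * dv_sq + v_dv ** 2) / (1.0 - v_sq) ** 2


v = [0.3, 0.4, 0.2]
dv = [1e-5, -2e-5, 3e-5]
w = [a + b for a, b in zip(v, dv)]

exact = relative_speed_sq(v, w)     # finite expression (15.90)
approx = metric_ds2(v, dv)          # line element (15.93)
ratio = exact / approx              # tends to 1 as dv tends to 0

# Collinear case: the invariant relative speed is tanh of a rapidity difference.
rel = math.sqrt(relative_speed_sq([0.3, 0.0, 0.0], [0.8, 0.0, 0.0]))
rapidity_gap = math.atanh(0.8) - math.atanh(0.3)
```

The ratio differs from 1 only by a term of first order in `dv`, in line with the neglected third order terms, and `math.atanh(rel)` reproduces `rapidity_gap`.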


Exercise 15.4.3 Consider the metric (15.94) and define the Lagrangian: 

$$L = \frac{\dot{v}^2}{(1 - v^2)^2} + \frac{v^2}{1 - v^2}\left(\dot{\theta}^2 + \sin^2\theta\,\dot{\phi}^2\right) \tag{15.97}$$

where dot means derivation wrt an affine parameter along a geodesic in 3-velocity 
space. 

(a) Show that Lagrange equations are: 

$$\ddot{\theta} + \frac{2}{v(1 - v^2)}\dot{v}\dot{\theta} - \sin\theta\cos\theta\,\dot{\phi}^2 = 0 \tag{15.98}$$

$$\ddot{\phi} + 2\cot\theta\,\dot{\theta}\dot{\phi} + \frac{2}{v(1 - v^2)}\dot{v}\dot{\phi} = 0 \tag{15.99}$$

$$\ddot{v} + \frac{2v}{1 - v^2}\dot{v}^2 - v\left(\dot{\theta}^2 + \sin^2\theta\,\dot{\phi}^2\right) = 0. \tag{15.100}$$

22 More information on the geometry of the 3-velocity space can be found in V. Fock (1976) "The 
Theory of Space, Time and Gravitation" 2nd Revised Edition, Pergamon Press. 


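(b) From Lagrange equations conclude that the non vanishing connection 
coefficients²³ are (with the coordinates ordered as $(x^1, x^2, x^3) = (v, \theta, \phi)$): 

$$\Gamma^2_{12} = \frac{1}{v(1 - v^2)}, \qquad \Gamma^2_{33} = -\sin\theta\cos\theta$$

$$\Gamma^3_{13} = \frac{1}{v(1 - v^2)}, \qquad \Gamma^3_{23} = \cot\theta$$

$$\Gamma^1_{11} = \frac{2v}{1 - v^2}, \qquad \Gamma^1_{33} = -v\sin^2\theta, \qquad \Gamma^1_{22} = -v.$$

(c) The Ricci tensor $R_{ij}$ is defined by the relation: 

$$R_{ij} = \Gamma^k_{ij,k} - \Gamma^k_{ik,j} - \Gamma^l_{ik}\Gamma^k_{jl} + \Gamma^k_{ij}\Gamma^l_{kl}. \tag{15.101}$$

Show that: 

$$R_{ij} = -2g_{ij}. \tag{15.102}$$

(d) The scalar curvature $R$ is defined by the relation: 

$$R = g^{ij}R_{ij}. \tag{15.103}$$

Prove that $R = -6$. 
(e) The curvature tensor $R_{ijkl}$ is defined by the relation: 

$$R_{ijkl} = g_{ik}R_{jl} - g_{jk}R_{il} - g_{il}R_{jk} + g_{jl}R_{ik} - \frac{1}{2}\left(g_{ik}g_{jl} - g_{il}g_{jk}\right)R. \tag{15.104}$$

Show that: 

$$R_{ijkl} = -\left(g_{ik}g_{jl} - g_{il}g_{jk}\right). \tag{15.105}$$

Conclude that the space of 3-velocities endowed with the metric $(\mathbf{w}')^2$ is a space 
of constant negative curvature ($R = -6$). 

23 One can also compute the connection coefficients directly from the (diagonal) metric by means 
of the formulae (no summation over repeated indices): 

$$\Gamma^i_{ii} = \partial_i\ln\sqrt{|g_{ii}|}, \qquad \Gamma^i_{ij} = \partial_j\ln\sqrt{|g_{ii}|}, \qquad \Gamma^i_{jj} = -\frac{1}{2g_{ii}}\frac{\partial g_{jj}}{\partial x^i},$$

and $\Gamma^i_{jk} = 0$ when all indices are different. 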


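15.4.4 The Relativistic Rule for the Composition 
of 3-Accelerations 


We recall that the components of the 4-acceleration $a^i$ of a relativistic particle in a 
LCF $\Sigma$ are (c = 1): 

$$a^i = \begin{pmatrix}a_0 \\ a_0\mathbf{u} + \gamma_u^2\mathbf{a}\end{pmatrix}_{\Sigma}$$

where $\mathbf{u}$ is the 3-velocity of the particle in $\Sigma$, $\mathbf{a} = \frac{d\mathbf{u}}{dt}$ is the 3-acceleration of the 
particle in $\Sigma$ and $a_0 = \gamma_u^4(\mathbf{u}\cdot\mathbf{a})$. 

Let $P$, $P'$ be the points in Minkowski space related by the Lorentz transformation 
and let $a^i_P$ and $a^i_{P'}$ be the corresponding four-acceleration vectors resulting from the 
transformation. We consider a LCF $\Sigma_u$ (the same for both accelerations!) in which 
we shall relate the components of the accelerations. In $\Sigma_u$ we have the components: 

$$a^i_P = \begin{pmatrix}a_0 \\ a_0\mathbf{w} + \gamma_w^2\mathbf{a}\end{pmatrix}_{\Sigma_u}, \qquad a^i_{P'} = \begin{pmatrix}a'_0 \\ a'_0\mathbf{w}' + \gamma_{w'}^2\mathbf{a}'\end{pmatrix}_{\Sigma_u}$$

where $a_0 = \gamma_w^4(\mathbf{w}\cdot\mathbf{a})$, $a'_0 = \gamma_{w'}^4(\mathbf{w}'\cdot\mathbf{a}')$, $\mathbf{a} = \frac{d\mathbf{w}}{dt}$, $\mathbf{a}' = \frac{d\mathbf{w}'}{dt}$ (all quantities referring 
to $\Sigma_u$). We compute (see (15.71)): 

$$A_u = -a_0$$

$$A'_u = -a'_0$$

$$A'_v = -a'_0\gamma_v + \gamma_v\,\mathbf{v}\cdot(a'_0\mathbf{w}' + \gamma_{w'}^2\mathbf{a}').$$

To find the transformation of the zero coordinate we replace in equation (15.76): 

$$a_0 = \gamma_v(1 + \mathbf{v}\cdot\mathbf{w}')\left[a'_0 + \frac{\gamma_{w'}^2}{Q_{w'}}(\mathbf{v}\cdot\mathbf{a}')\right]. \tag{15.106}$$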


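Concerning the spatial part, from (15.77) we have: 

$$a_0\mathbf{w} + \gamma_w^2\mathbf{a} = a'_0\mathbf{w}' + \gamma_{w'}^2\mathbf{a}' + \frac{1}{1+\gamma_v}\left\{-(2\gamma_v + 1)(-a'_0) + (-a'_0)\gamma_v + \gamma_v\,\mathbf{v}\cdot(a'_0\mathbf{w}' + \gamma_{w'}^2\mathbf{a}')\right\}\gamma_v\mathbf{v}.$$

Replacing $\mathbf{w}$ from (15.82) and remembering that $Q_{w'} = 1 + \mathbf{v}\cdot\mathbf{w}'$ we find: 

$$\begin{aligned}
\gamma_w^2\mathbf{a} &= a'_0\mathbf{w}' - \frac{a_0}{\gamma_vQ_{w'}}\left[\mathbf{w}' + \gamma_v\left(1 + \frac{\gamma_v}{1+\gamma_v}\,\mathbf{v}\cdot\mathbf{w}'\right)\mathbf{v}\right] + \gamma_{w'}^2\mathbf{a}' \\
&\quad + \frac{\gamma_v}{1+\gamma_v}\left[(1 + \gamma_v)a'_0 + \gamma_va'_0\,\mathbf{v}\cdot\mathbf{w}' + \gamma_v\gamma_{w'}^2\,\mathbf{v}\cdot\mathbf{a}'\right]\mathbf{v} \\
&= \left(a'_0 - \frac{a_0}{\gamma_vQ_{w'}}\right)\mathbf{w}' + \gamma_{w'}^2\mathbf{a}' \\
&\quad + \frac{\gamma_v}{1+\gamma_v}\left\{a'_0(1 + \gamma_vQ_{w'}) + \gamma_v\gamma_{w'}^2(\mathbf{v}\cdot\mathbf{a}') - \frac{a_0}{\gamma_vQ_{w'}}(1 + \gamma_vQ_{w'})\right\}\mathbf{v}.
\end{aligned} \tag{15.107}$$

The term multiplying $\mathbf{w}'$ reads if we use (15.106): 

$$a'_0 - \frac{a_0}{\gamma_vQ_{w'}} = -\frac{\gamma_{w'}^2}{Q_{w'}}(\mathbf{v}\cdot\mathbf{a}'). \tag{15.108}$$

Similarly the term multiplying $\mathbf{v}$ gives if we replace $a_0$ from (15.108): 

$$\frac{\gamma_v}{1+\gamma_v}\left\{a'_0(1 + \gamma_vQ_{w'}) + \gamma_v\gamma_{w'}^2(\mathbf{v}\cdot\mathbf{a}') - \left(a'_0 + \frac{\gamma_{w'}^2}{Q_{w'}}(\mathbf{v}\cdot\mathbf{a}')\right)(1 + \gamma_vQ_{w'})\right\} = -\frac{\gamma_v}{1+\gamma_v}\frac{\gamma_{w'}^2}{Q_{w'}}(\mathbf{v}\cdot\mathbf{a}').$$

Replacing these in (15.107) we obtain after some tedious but standard calculations 
the result: 

$$\mathbf{a} = \frac{1}{\gamma_v^2Q_{w'}^3}\left[Q_{w'}\mathbf{a}' - (\mathbf{v}\cdot\mathbf{a}')\mathbf{w}' - \frac{\gamma_v}{1+\gamma_v}(\mathbf{v}\cdot\mathbf{a}')\mathbf{v}\right]. \tag{15.109}$$

It is easy to check that (15.109) coincides with the transformation rule for 
3-acceleration (7.20) derived in Sect. 7. 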


15.5 The Composition of Lorentz Transformations 


Another application of the composite four-vector is the computation of products 
of successive Lorentz transformations. We consider three linearly independent unit 
four-vectors (that is, the velocities of three relativistic observers in relative motion) 
$u^i$, $v^i$, $w'^i$ and the Lorentz transformations they define: 

$$L_v = L(u, v), \qquad L_{w'} = L(u, w'), \qquad L_w = L(u, w).$$

From the defining equations (15.13) of the Lorentz transformation we have: 

$$L_vv^i = A_vu^i \tag{15.110}$$

$$L_vu^i = C_vu^i + D_vv^i \tag{15.111}$$

$$L_v^{-1}u^i = A_vv^i \tag{15.112}$$

$$L_v^{-1}v^i = C_vv^i + D_vu^i \tag{15.113}$$

where $A_v$, $C_v$, $D_v$ are constants given in Table 15.1. We recall that $A_v = \pm 1$. 
Similar relations hold for the other two Lorentz transformations $L_{w'}$, $L_w$. Of course 
in each case we have to change the index in the coefficients and write $A_{w'}$, $C_{w'}$, $D_{w'}$ 
for $L_{w'}$ and $A_w$, $C_w$, $D_w$ for $L_w$. Let $w^i$ be the composite four-vector of $w'^i$ under 
the transformation $L_v$. Then: 

$$w^i = L_v^{-1}w'^i, \qquad w'^i = L_vw^i. \tag{15.114}$$

Let $L_w$ be the Lorentz transformation defined by the four-vectors $u^i$, $w^i$. We consider 
the product of transformations $L_{w'}L_vL_w^{-1}$ and study its effect on the four-vector $u^i$. 
We have: 

$$\begin{aligned}
(L_{w'}L_vL_w^{-1})u^i &= L_{w'}L_vA_ww^i && \text{(Use (15.112))} \\
&= A_wL_{w'}L_vL_v^{-1}w'^i && \text{(Use (15.114))} \\
&= A_wL_{w'}w'^i && \text{(Use (15.110))} \\
&= A_wA_{w'}u^i.
\end{aligned}$$

We conclude that the action of the composite transformation leaves the length 
and the direction of the four-vector $u^i$ invariant but not the sense of direction 
(this changes when $A_wA_{w'} = -1$). This means that the effect of the composite 
transformation is a spatial rotation in the spatial plane normal to the four-vector 
$u^i$ (that is, the rest space of the observer $u^i$). The set of all these transformations 
constitutes a group known as the little group or isotropy group of $u^i$. The 
dimension of this group equals 3 and it is this group which makes possible the 
covariant 1+3 decomposition of a four-vector in temporal and spatial parts. We 
write: 

$$L_{w'}L_vL_w^{-1} = R(u). \tag{15.115}$$

The computation of the composite transformation $R(u)$ is difficult and involved, 
especially if one follows the standard vector form of the Lorentz transformation. 
However using the covariant Lorentz transformation one computes $R(u)$ relatively 
easily and certainly in covariant form! 

In the following we compute $R(u)$ for the proper Lorentz transformation (defined by 
$A_w = A_{w'} = 1$). First we write $R(u)$ in the form of a block matrix as follows: 

$$R(u) = \begin{pmatrix}1 & 0 \\ 0 & A\end{pmatrix} \tag{15.116}$$


where $A$ is a Euclidean 3 × 3 matrix, that is $A^tA = I_3$, generating a rotation in the 
spatial plane normal to the four-vector $u^i$. 

From Table 15.3 we have for the proper Lorentz transformation and its inverse 
the expressions: 

$$(L_v)^i{}_j = \delta^i_j + \frac{1}{1+\gamma_v}(u^i + v^i)(u_j + v_j) - 2u^iv_j \tag{15.117}$$

$$(L_v^{-1})^i{}_j = \delta^i_j + \frac{1}{1+\gamma_v}(u^i + v^i)(u_j + v_j) - 2v^iu_j. \tag{15.118}$$

From equations (15.114) and (15.118) follows: 

$$w^i = w'^i + \frac{1}{1+\gamma_v}\left[(1 + 2\gamma_v)\gamma_{w'} - \psi\right]v^i - \frac{\gamma_{w'} + \psi}{1+\gamma_v}u^i \tag{15.119}$$

where we have set $\psi = -w'^iv_i$. 
We express $\gamma_w$ in terms of $\psi$. 
From the transformation equation 

$$A_u = 2\gamma A'_u - A'_v \tag{15.120}$$

of the zeroth component of the composite four-vector (see (15.76)) we have in the 
current notation: 

$$\gamma_w = 2\gamma_v\gamma_{w'} - \psi. \tag{15.121}$$
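Now we are ready to work with the transformation $R(u)$. We have: 

$$\begin{aligned}
R^i{}_j &= (L_{w'}L_vL_w^{-1})^i{}_j \\
&= (L_{w'}L_v)^i{}_k\left[\delta^k_j + \frac{1}{1+\gamma_w}(u^k + w^k)(u_j + w_j) - 2w^ku_j\right] \\
&= (L_{w'}L_v)^i{}_j + \frac{1}{1+\gamma_w}\left[L_{w'}L_v(u + w)\right]^i(u_j + w_j) - 2\left[L_{w'}L_vw\right]^iu_j.
\end{aligned} \tag{15.122}$$

The term: 

$$-2L_{w'}L_vw^i = -2L_{w'}w'^i = -2u^i. \tag{15.123}$$

The term: 

$$\begin{aligned}
L_{w'}L_v(u^i + w^i) &= L_{w'}L_vu^i + L_{w'}L_vw^i = L_{w'}(-v^i + 2\gamma_vu^i) + u^i \\
&= -L_{w'}v^i + 2\gamma_vL_{w'}u^i + u^i \\
&= -L_{w'}v^i + 2\gamma_v(-w'^i + 2\gamma_{w'}u^i) + u^i \\
&= -L_{w'}v^i - 2\gamma_vw'^i + (1 + 4\gamma_v\gamma_{w'})u^i.
\end{aligned} \tag{15.124}$$

Using (15.117) for the transformation $L_{w'}$ we compute: 

$$L_{w'}v^i = v^i + \frac{1}{1+\gamma_{w'}}\left(\psi + 2\psi\gamma_{w'} - \gamma_v\right)u^i - \frac{1}{1+\gamma_{w'}}(\gamma_v + \psi)w'^i. \tag{15.125}$$

Replacing we find that the term: 

$$\begin{aligned}
L_{w'}L_v(u^i + w^i) &= -v^i + \frac{1}{1+\gamma_{w'}}\left(1 + \gamma_{w'} + 4\gamma_v\gamma_{w'} + 4\gamma_v\gamma_{w'}^2 - \psi - 2\psi\gamma_{w'} + \gamma_v\right)u^i \\
&\quad + \frac{1}{1+\gamma_{w'}}\left(\psi - \gamma_v - 2\gamma_v\gamma_{w'}\right)w'^i.
\end{aligned} \tag{15.126}$$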


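For the other four-vectors we find using (15.117): 

$$\begin{aligned}
(L_{w'}L_v)^i{}_j &= \delta^i_j + \frac{1 + \gamma_v + \gamma_{w'} + 2\gamma_v\gamma_{w'} - \psi}{(1+\gamma_v)(1+\gamma_{w'})}\,w'^iv_j + \frac{1}{1+\gamma_{w'}}\,w'^iw'_j + \frac{1}{1+\gamma_v}\,v^iv_j \\
&\quad - \frac{\gamma_{w'} + \psi}{(1+\gamma_v)(1+\gamma_{w'})}\,w'^iu_j + \frac{1}{1+\gamma_v}\,v^iu_j - \frac{2\gamma_{w'} + 1}{1+\gamma_{w'}}\,u^iw'_j \\
&\quad + \frac{\psi - \gamma_v - 2\gamma_{w'} - 2\gamma_{w'}^2 + 2\psi\gamma_{w'} - 4\gamma_v\gamma_{w'} - 4\gamma_v\gamma_{w'}^2}{(1+\gamma_v)(1+\gamma_{w'})}\,u^iv_j \\
&\quad + \frac{1 + \psi + 2\gamma_{w'} + 2\psi\gamma_{w'} + 2\gamma_{w'}^2}{(1+\gamma_v)(1+\gamma_{w'})}\,u^iu_j.
\end{aligned}$$

Collecting the above intermediate results and after quite a load of calculations we 
obtain the final result²⁴: 

$$\begin{aligned}
R^i{}_j(u) = \delta^i_j + \frac{1}{P}\Big[&(1 - \gamma_{w'}^2)v^iv_j - (1 + \gamma_v)(1 + \gamma_{w'})v^iw'_j \\
&+ (1 - \gamma_v^2)w'^iw'_j + (\gamma_{w'} + 1 + \gamma_v + 3\gamma_v\gamma_{w'} - 2\psi)w'^iv_j\Big]
\end{aligned} \tag{15.127}$$

where we have set: 

$$P = (1 + \gamma_{w'})(1 + \gamma_v)(1 + \gamma_w) = (1 + \gamma_v)(1 + \gamma_{w'})(1 + 2\gamma_v\gamma_{w'} - \psi). \tag{15.128}$$

24 In the calculations we can omit the terms containing $u^i$ because they vanish except the term $u^iu_j$ 
which gives 1. 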


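In order to compute the Euclidean angle $\theta$ introduced by $R(u)$ we use the 
standard formula²⁵: 

$$\mathrm{Tr}\,R(u) = 1 + 2\cos\theta \tag{15.129}$$

where $\mathrm{Tr}\,R(u)$ is the trace of $R(u)$. Using (15.127) we find: 

$$\begin{aligned}
\cos\theta = \frac{1}{2}\left(R^\mu{}_\mu - 1\right) = \frac{1}{2P}\Big[&(1 - \gamma_{w'}^2)v^\mu v_\mu - (1 + \gamma_v)(1 + \gamma_{w'})v^\mu w'_\mu \\
&+ (1 - \gamma_v^2)w'^\mu w'_\mu + (\gamma_{w'} + 1 + \gamma_v + 3\gamma_v\gamma_{w'} - 2\psi)w'^\mu v_\mu\Big] + 1.
\end{aligned}$$

For the terms involved in this relation we compute ($\phi$ being the angle between the 3-velocities $\mathbf{v}$ and $\mathbf{w}'$): 

$$v^\mu v_\mu = -1 + \gamma_v^2$$

$$w'^\mu w'_\mu = -1 + \gamma_{w'}^2$$

$$w'^\mu v_\mu = \sqrt{-1 + \gamma_v^2}\,\sqrt{-1 + \gamma_{w'}^2}\,\cos\phi$$

$$\psi = -w'^iv_i = w'^0v^0 - w'^\mu v_\mu = \gamma_v\gamma_{w'} - \sqrt{-1 + \gamma_v^2}\,\sqrt{-1 + \gamma_{w'}^2}\,\cos\phi.$$

Replacing we find the cosine of the angle $\theta$: 

$$\cos\theta = 1 - \frac{(\gamma_v - 1)(\gamma_{w'} - 1)\sin^2\phi}{1 + \gamma_v\gamma_{w'} + \sqrt{-1 + \gamma_v^2}\,\sqrt{-1 + \gamma_{w'}^2}\,\cos\phi}. \tag{15.130}$$

To compare this result with existing results in the literature we introduce the 
quantity $\tau = \sqrt{\dfrac{(\gamma_v + 1)(\gamma_{w'} + 1)}{(\gamma_v - 1)(\gamma_{w'} - 1)}}$. Then it can be shown that (15.130) reads: 

$$\cos\theta = 1 - \frac{2\sin^2\phi}{1 + \tau^2 + 2\tau\cos\phi} \tag{15.131}$$

which coincides with the existing result in the literature.²⁶ 

25 See e.g. page 163 in H. Goldstein, C. Poole, J. Safko "Classical Mechanics" Third Edition (2002) 
Addison-Wesley Publishing Company. 

26 See equation (10) of the paper "Wigner's rotation revisited" Ar Ben-Mehanem, Am. J. Phys. 
(1985) 53, 62-66. 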
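The rotation angle of (15.130) and (15.131) can be cross-checked with ordinary 4 × 4 boost matrices. The sketch below is not the covariant computation of the text: it assumes c = 1, the standard matrix form of an active boost, and nonzero velocities, and the names `boost`, `matmul`, `wigner_cos_numeric` and `wigner_cos_formula` are illustrative. It multiplies two boosts, removes the boost part of the product, and reads the angle from the trace of the remaining rotation as in (15.129).

```python
import math


def boost(v):
    """Active 4x4 boost taking (1, 0, 0, 0) to gamma * (1, v); needs |v| > 0."""
    v_sq = sum(c * c for c in v)
    g = 1.0 / math.sqrt(1.0 - v_sq)
    m = [[0.0] * 4 for _ in range(4)]
    m[0][0] = g
    for i in range(3):
        m[0][i + 1] = m[i + 1][0] = g * v[i]
        for j in range(3):
            m[i + 1][j + 1] = (1.0 if i == j else 0.0) + (g - 1.0) * v[i] * v[j] / v_sq
    return m


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def wigner_cos_numeric(v, w_prime):
    """cos(theta) from Tr R = 1 + 2 cos(theta), with R = B(-u) B(v) B(w')."""
    lam = matmul(boost(v), boost(w_prime))
    u = [lam[i + 1][0] / lam[0][0] for i in range(3)]   # composite 3-velocity
    rot = matmul(boost([-c for c in u]), lam)           # pure spatial rotation
    return (rot[1][1] + rot[2][2] + rot[3][3] - 1.0) / 2.0


def wigner_cos_formula(v, w_prime):
    """cos(theta) from (15.130) and from (15.131), returned as a pair."""
    gv = 1.0 / math.sqrt(1.0 - sum(c * c for c in v))
    gw = 1.0 / math.sqrt(1.0 - sum(c * c for c in w_prime))
    norm = math.sqrt(sum(c * c for c in v) * sum(c * c for c in w_prime))
    cos_phi = sum(a * b for a, b in zip(v, w_prime)) / norm
    sin_sq = 1.0 - cos_phi ** 2
    root = math.sqrt(gv * gv - 1.0) * math.sqrt(gw * gw - 1.0)
    first = 1.0 - (gv - 1.0) * (gw - 1.0) * sin_sq / (1.0 + gv * gw + root * cos_phi)
    tau = math.sqrt((gv + 1.0) * (gw + 1.0) / ((gv - 1.0) * (gw - 1.0)))
    second = 1.0 - 2.0 * sin_sq / (1.0 + tau * tau + 2.0 * tau * cos_phi)
    return first, second


v = [0.6, 0.0, 0.0]
w_prime = [0.3, 0.5, 0.0]
cos_numeric = wigner_cos_numeric(v, w_prime)
cos_130, cos_131 = wigner_cos_formula(v, w_prime)
```

The three values agree to rounding error in this sketch. The order of the two boosts in the product changes the sense of the rotation but not the cosine of its angle, so the comparison does not depend on that convention.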


Chapter 16 
Null Triads and Proper Lorentz 
Transformation 


16.1 Tetrads and the Lorentz Transformation 


A tetrad is a set of four linearly independent four-vectors. These vectors can serve 
as a basis in which one can decompose any other four-vector. If these vectors 
are orthonormal then the tetrad is called an orthonormal tetrad. Any tetrad whose 
vectors are non-null can be transformed to an orthonormal tetrad, because one can 
always choose a Euclidean transformation in SO(3) which will make the three 
spatial axes orthonormal (plus a dilatation if the original spacelike vectors are not 
unit). Furthermore if the timelike vector is not normal to the spacelike vectors 
(equivalently to the spacelike 3-space spanned by these vectors; such tetrads are 
called tilted tetrads) then by means of a boost one can always define a new timelike 
vector normal to the 3-surface (just take the projection of the timelike four-vector 
along the normal to the 3-surface). Subsequently one normalizes the new four-vector 
to have length —1 and ends up with an orthonormal tetrad. 

Since to an arbitrary tetrad we can always associate an orthonormal tetrad in the 
following we restrict our considerations to orthonormal tetrads only. 


Exercise 16.1.1 Write a program in a computer algebra system which gives 
the orthonormal tetrad when an arbitrary tetrad is given. 
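One possible starting point for this exercise is sketched below. It is numerical rather than algebraic, and it uses a Gram-Schmidt procedure with the Lorentz metric instead of the rotation-plus-boost construction described above, so it should be read as an assumption-laden illustration: the metric is taken as diag(-1, 1, 1, 1), the first vector of the input tetrad is assumed timelike, and the names `dot` and `orthonormal_tetrad` are hypothetical.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # assumed metric diag(-1, 1, 1, 1)


def dot(a, b):
    """Minkowski inner product of two four-vectors."""
    return sum(g * x * y for g, x, y in zip(ETA, a, b))


def orthonormal_tetrad(tetrad):
    """Lorentzian Gram-Schmidt.

    Assumes the four vectors are linearly independent and tetrad[0] is timelike.
    """
    basis = []
    for vec in tetrad:
        r = list(vec)
        for e in basis:
            # Subtract the component of r along e; dot(e, e) is -1 or +1.
            coeff = dot(r, e) / dot(e, e)
            r = [ri - coeff * ei for ri, ei in zip(r, e)]
        norm = math.sqrt(abs(dot(r, r)))
        basis.append([ri / norm for ri in r])
    return basis


arbitrary = [
    [2.0, 0.5, 0.0, 0.0],   # timelike
    [1.0, 1.0, 0.0, 0.3],
    [0.2, 0.0, 1.0, 1.0],
    [0.0, 1.0, 1.0, 2.0],
]
E = orthonormal_tetrad(arbitrary)
gram = [[dot(E[i], E[j]) for j in range(4)] for i in range(4)]
```

The matrix `gram` of inner products comes out as diag(-1, 1, 1, 1). Once the timelike vector has been normalized, every later residual lies in its orthogonal complement, which is spacelike, so the division by `norm` is safe for linearly independent input.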


In spacetime there is the natural orthonormal tetrad (NOT) $\{E_{(a)}\}$ defined by 
the vectors $E_{(0)} = (1, 0, 0, 0)$, $E_{(1)} = (0, 1, 0, 0)$, $E_{(2)} = (0, 0, 1, 0)$, $E_{(3)} = (0, 0, 0, 1)$. 
This tetrad is holonomic,¹ has positive orientation² and geometrically 
consists of the tangent vectors to the Cartesian coordinates $\{ct, x, y, z\}$ in 


1 This means that the Lie algebra spanned by the four vectors is the Abelian algebra. Equivalently 
the tetrad vectors can be written as $\partial/\partial x^i$ where $x^i$ are coordinate functions. 


2 Positive orientation means that the determinant with columns the components of the vectors $E_{(i)}$, 
$i = 0, 1, 2, 3$ equals +1. 


© Springer Nature Switzerland AG 2019 589 
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-27347-7_16 


Minkowski space, that is, $E_{(0)} = \frac{\partial}{\partial(ct)}$, $E_{(1)} = \frac{\partial}{\partial x}$, $E_{(2)} = \frac{\partial}{\partial y}$, $E_{(3)} = \frac{\partial}{\partial z}$. Because 
the proper orthochronous Lorentz transformation preserves the lengths and the 
angles between four-vectors and also the sign of the zero component it follows that 
the application of the proper orthochronous Lorentz transformation to each of the 
vectors of the NOT will produce a new orthonormal tetrad with the same orientation. 
Furthermore the coordinates of the new four-vectors will be the column entries of 
the Lorentz transformation in the new frame. Let us see an example. 

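Consider the boost along the x-axis relating the observers $\Sigma$, $\Sigma'$. Then as we 
have seen the Lorentz matrix is: 

$$L = \begin{pmatrix}\gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1\end{pmatrix}.$$

The application of the Lorentz transformation on the NOT of $\Sigma$ gives the four 
vectors $I$, $J$, $K$, $L$ which in the NOT of the new observer $\Sigma'$ are 

$$I^a = \gamma E_0 - \beta\gamma E_1 = L^a{}_0, \quad J^a = -\beta\gamma E_0 + \gamma E_1 = L^a{}_1, \quad K^a = E_2 = L^a{}_2, \quad L^a = E_3 = L^a{}_3 \tag{16.1}$$

or 

$$I^a = \begin{pmatrix}\gamma \\ -\beta\gamma \\ 0 \\ 0\end{pmatrix}_{\Sigma'}, \quad J^a = \begin{pmatrix}-\beta\gamma \\ \gamma \\ 0 \\ 0\end{pmatrix}_{\Sigma'}, \quad K^a = \begin{pmatrix}0 \\ 0 \\ 1 \\ 0\end{pmatrix}_{\Sigma'}, \quad L^a = \begin{pmatrix}0 \\ 0 \\ 0 \\ 1\end{pmatrix}_{\Sigma'}. \tag{16.2}$$

More generally a four-vector $A^a$ which in the NOT $\{E_{(a)}\}$ of $\Sigma$ has components: 

$$A^a = A^{(b)}E^a_{(b)} \tag{16.3}$$

in the NOT of $\Sigma'$ has components $A^{a'}$: 

$$A^{a'} = L^{a'}{}_aA^a. \tag{16.4}$$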

From the above we arrive at the following conclusion: 


Every proper orthochronous Lorentz transformation can be associated with a right-handed 
orthonormal tetrad. This is done by considering the components of each of the vectors of 
the tetrad as columns of the Lorentz matrix. The order we write the Lorentz matrix is that 
the first column is always the timelike four-vector and the order of the rest three spacelike 
tetrad vectors is such that the determinant of the resulting transformation equals +1. 
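The correspondence between a Lorentz matrix and a tetrad can be made concrete with the boost of the example. The sketch below assumes the metric diag(-1, 1, 1, 1); the helper names `boost_x`, `column`, `dot` and `det` are illustrative. It reads the tetrad vectors off the columns of the boost matrix and checks orthonormality and orientation.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # assumed metric diag(-1, 1, 1, 1)


def dot(a, b):
    """Minkowski inner product of two four-vectors."""
    return sum(g * x * y for g, x, y in zip(ETA, a, b))


def boost_x(beta):
    """Lorentz matrix of a boost along the x-axis with speed beta (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -beta * g, 0.0, 0.0],
            [-beta * g, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def column(m, k):
    """k-th column of a matrix stored as a list of rows."""
    return [row[k] for row in m]


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


L = boost_x(0.6)
tetrad = [column(L, k) for k in range(4)]      # the vectors I, J, K, L of (16.2)
gram = [[dot(tetrad[i], tetrad[j]) for j in range(4)] for i in range(4)]
orientation = det(L)
```

Here `gram` equals diag(-1, 1, 1, 1), `orientation` equals +1, and the first column has a positive zero component, so the columns form a positively oriented orthonormal tetrad with a future directed timelike vector.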


A consequence of the above conclusion is that there exists a close relation 
between positively oriented tetrads and the proper orthochronous Lorentz 
transformation, hence we look for ways which define a tetrad in $M^4$ and examine how this 
is related to the proper orthochronous Lorentz transformation. In the following we 
shall discuss two methods which define a tetrad: the null triad, consisting of three 
linearly independent future pointing³ null four-vectors, and the null tetrad, which 
consists of two null vectors and two complex vectors. 


16.2 The Null Triad 


A direction in spacetime is determined by means of three direction cosines. Indeed if 
$A^i = (A^0, A^x, A^y, A^z)_{\Sigma}$ is a four-vector in the frame of a RIO $\Sigma$ then the direction 
of $A^i$ in spacetime for $\Sigma$ is given by the three ratios $A^0 : A^x : A^y : A^z$. If $A^i$ 
is a null vector then we have the additional condition $A^iA_i = 0$ which reduces the 
number of independent ratios to two. This means that $\Sigma$ can picture the null vectors 
$A^i$ as a point on the two dimensional unit sphere which is constructed from the 
intersection of the null cone (where all null vectors reside) with the timelike plane 
(for $\Sigma$!) $A^0 = 1$. Indeed we have 

$$0 = A^iA_i = (1, \mathbf{A})^t(1, \mathbf{A}) = -1 + \mathbf{A}^2 \;\Rightarrow\; \mathbf{A}^2 = 1.$$


In the following we assume that the null vectors are future directed, that is A? > 0 
and the Lorentz transformation is a proper orthochronous Lorentz transformation. 

A triad of (future directed) linearly independent null four-vectors is represented 
by three distinct points on the surface of this sphere. Each of these points requires 
two coordinates to be fixed (e.g. longitude and latitude) hence the triad of null 
vectors requires six parameters to be specified as many as the parameters of the 
(proper) Lorentz group. Therefore the point transformations of the unit sphere will 
be in one to one correspondence with the proper orthochronous Lorentz group! 
Descriptively speaking we may say that the proper Lorentz transformation moves 
the three points of the unit sphere to new three points, acting on each point 
individually. This last remark is important. 

As we have discussed above the proper Lorentz transformation is related to 
orthonormal tetrads. Therefore we expect that it should be possible to associate 
an orthonormal tetrad (up to an orientation) with a null triad. In the following 
we exploit this idea and eventually express the proper orthochronous Lorentz 
transformation in terms of the null four-vectors of the null triad. 

Before we proceed let us mention briefly the history of this approach. The idea 
of the null triad was introduced for the first time by J. L. Synge in his classic 
book.⁴ However he was not happy with the result, because the formulae were not 


3 Future pointing means that the zero component is positive. Under the action of a proper Lorentz 
transformation it remains positive. 


4 J.L. Synge (1965) "Relativity: The Special Theory" North Holland Amsterdam p. 94. 


"symmetric" in the sense that one was not following from the others with some 
kind of cyclic replacement of the vectors involved.⁵ In fact he reluctantly writes "a 
symmetric plan would be best, but none exists". The basic idea of Synge was to 
use two of the null vectors to define a timelike vector and a characteristic direction 
for these two vectors (details are given below) and then use the third null 
vector and the timelike vector to define an additional spacelike vector which 
allowed for the definition of the tetrad associated with the null triad. In this manner 
one has three options to choose the original pair of null vectors and this is the 
reason that Synge's approach was not "symmetric". Later G. R. Allcock⁶ tried to 
rewrite Synge's work in a symmetric way using the NOT tetrad. He proved that 
given three linearly independent future directed null vectors there is always an 
orthonormal tetrad in which the spatial parts of the null vectors are along the positive 
sense of the spatial axes of the tetrad. It is a good exercise to reproduce Allcock's 
results. 


16.2.1 The Allcock Approach 


Consider three null vectors $P^a$, $Q^a$, $R^a$ normalized so that $P^aQ_a = P^aR_a = 
Q^aR_a = 1$ (below in Lemma 16.2.1 we show that this is always possible therefore 
there is no need to consider the scale factors which Allcock considers) and require 
that there exists a tetrad such that the spatial direction of propagation of the null 
vectors is along the spatial axes of the tetrad. Let the tetrad be the vectors $E^a_{(I)}$ 
where the index $I = 0, 1, 2, 3$ and $E_{(0)}$ being the unit timelike vector. Then we may 
write ($p, q, r > 0$) 

$$P^a = p(E_{(0)} + E_{(1)}), \qquad Q^a = q(E_{(0)} + E_{(2)}), \qquad R^a = r(E_{(0)} + E_{(3)}) \tag{16.5}$$

and in components (in this tetrad!) 

$$P^a = p(1, 1, 0, 0), \qquad Q^a = q(1, 0, 1, 0), \qquad R^a = r(1, 0, 0, 1).$$

In order to determine the four vectors $E_{(I)}$ from the three vectors we need one more 
vector. We introduce the vector $S^a = \eta^{abcd}P_bQ_cR_d$. We compute easily that 

$$S^a = pqr(E_{(0)} + E_{(1)} + E_{(2)} + E_{(3)}) \quad \text{and in components} \quad S^a = pqr(1, 1, 1, 1). \tag{16.6}$$


5 J.L. Synge was a man devoted to Geometry! And Geometry is in close terms with "symmetry" and 
"esoteric" simplicity, terms that only true geometers understand! 


6 G. R. Allcock "Synge's triads of null rays" (1980) Am. J. Phys. 48, 410-41. 


Obviously $S^a$ is a spacelike vector. From the system of relations (16.5) and (16.6) 
we compute the tetrad vectors in terms of the vectors of the null triad as 
follows: 

$$E_{(0)} = \frac{1}{2p}P^a + \frac{1}{2q}Q^a + \frac{1}{2r}R^a - \frac{1}{2pqr}S^a$$

$$E_{(1)} = -E_{(0)} + \frac{1}{p}P^a$$

$$E_{(2)} = -E_{(0)} + \frac{1}{q}Q^a$$

$$E_{(3)} = -E_{(0)} + \frac{1}{r}R^a.$$

Although these expressions have a "symmetry" (in the cyclic sense), they are rather 
obvious and lack an elegant geometric representation and a clear kinematical 
interpretation. In the next section we develop briefly Synge's approach and 
subsequently we present a new geometric approach of the null triad which is 
covariant and symmetric, thus satisfying Synge's requirement. 


16.2.2. The Two Null Vector Approach (Synge) 


Let $A^a, B^a$ be two future directed null vectors. Then the inner product⁷ $A^a B_a < 0$.
By rescaling⁸ the vectors we can always make $A^a B_a = -1$. We define the two
vectors:

$$I^a = \frac{1}{\sqrt{2}}(A^a + B^a) \tag{16.7}$$

$$J^a = \frac{1}{\sqrt{2}}(A^a - B^a). \tag{16.8}$$


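⁷ Proof: Suppose that in a RIO $\Sigma$ the $A^a = A^0 (1, \hat{\mathbf{A}})_\Sigma$, $B^a = B^0 (1, \hat{\mathbf{B}})_\Sigma$,
where $\hat{\mathbf{A}}, \hat{\mathbf{B}}$ are unit 3-vectors and $A^0, B^0 > 0$ (because the vectors are future
directed). Then

$$A^a B_a = A^0 B^0(-1 + \hat{\mathbf{A}} \cdot \hat{\mathbf{B}}) = -A^0 B^0 (1 - \cos\theta) < 0,$$

where $\theta \neq 0$ is the angle between the spatial directions of the two (non-parallel) vectors.

⁸ This assumption does not restrict the generality of the results.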




We compute the inner products

$$(II) = -1, \quad (IJ) = 0, \quad (IA) = (IB) = -\frac{1}{\sqrt{2}},$$

$$(JJ) = 1, \quad (JA) = -(JB) = \frac{1}{\sqrt{2}}. \tag{16.9}$$

The null vectors $A^a, B^a$ in terms of the vectors $I^a, J^a$ are given by the relations:

$$A^a = \frac{1}{\sqrt{2}}(I^a + J^a) \tag{16.10}$$

$$B^a = \frac{1}{\sqrt{2}}(I^a - J^a). \tag{16.11}$$


From (16.9) it follows that $cI^a$ is the four-velocity of a characteristic observer and $J^a$
is a characteristic spatial direction associated with the two null four-vectors $A^a, B^a$.
Concerning the main property of the vectors $I^a, J^a$ we have the following:

a. The characteristic observer $cI^a$ is the one for which the null vectors $A^a, B^a$ have a
common zero component (i.e. the same energy or frequency), equal to
$-(IA) = -(IB) = \frac{1}{\sqrt{2}}$.

b. In order to find the main property of the spatial direction $J^a$ we consider the
projection tensor $h_{ab} = \eta_{ab} + I_a I_b$ associated with the characteristic observer
$cI^a$ and the spatial projections of the null vectors

$$A^a_\perp = h^a_b A^b, \quad B^a_\perp = h^a_b B^b.$$

We compute:

$$A^a_\perp = h^a_b A^b = (\delta^a_b + I^a I_b) A^b = A^a - \frac{1}{\sqrt{2}} I^a = \frac{1}{2}(A^a - B^a) = \frac{1}{\sqrt{2}} J^a$$

$$B^a_\perp = h^a_b B^b = (\delta^a_b + I^a I_b) B^b = -\frac{1}{\sqrt{2}} J^a.$$

We conclude that the main property of the characteristic direction $J^a$ in the rest
space of the observer $I^a$ is that the normal projections $A^a_\perp, B^a_\perp$ of the four-vectors
are parallel to $J^a$ and have opposite directions.

In physical terms let us consider a pair of photons. Then the characteristic
observer $I^a$ is the one for which the photons have the same frequency, and the
characteristic direction for this observer is the one along which the two photons
propagate in his 3-space in opposite directions.
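
The construction of the characteristic observer and direction from two null vectors can be checked numerically. The following is a minimal sketch, assuming the signature (-, +, +, +) used in the text; the helper names (`dot`, `null_vector`, `characteristic_pair`, `project`) and the sample vectors are illustrative choices, not part of the original treatment.

```python
import math

def dot(u, v):
    """Lorentz inner product with signature (-, +, +, +)."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))

def null_vector(energy, direction):
    """Future-directed null vector with the given zero component and 3-direction."""
    n = math.sqrt(sum(d * d for d in direction))
    return [energy] + [energy * d / n for d in direction]

def characteristic_pair(A, B):
    """Rescale A so that (AB) = -1, then build I and J as in (16.7) and (16.8)."""
    s = -1.0 / dot(A, B)  # positive, because (AB) < 0 for future-directed null vectors
    A = [s * a for a in A]
    I = [(a + b) / math.sqrt(2) for a, b in zip(A, B)]
    J = [(a - b) / math.sqrt(2) for a, b in zip(A, B)]
    return A, B, I, J

def project(I, V):
    """Spatial projection h^a_b V^b with h_ab = eta_ab + I_a I_b."""
    c = dot(I, V)
    return [v + c * i for v, i in zip(V, I)]

# Two arbitrary (hypothetical) photons with different energies and directions.
A, B, I, J = characteristic_pair(null_vector(2.0, (1.0, 2.0, -1.0)),
                                 null_vector(0.5, (0.0, 1.0, 3.0)))
A_perp, B_perp = project(I, A), project(I, B)

print("(II), (JJ), (IJ):", dot(I, I), dot(J, J), dot(I, J))
print("(IA), (IB):", dot(I, A), dot(I, B))
```

For the characteristic observer the two zero components `-(IA)` and `-(IB)` coincide, and the projections `A_perp`, `B_perp` come out as plus and minus `J` divided by the square root of 2.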

Let us consider now three linearly independent future directed null four-vectors
$A^a, B^a, C^a$ which define a null triad. We have the following result.




Lemma 16.2.1 One can always rescale the vectors $A^a, B^a, C^a$ of the null triad to
obtain the vectors of a new null triad $\bar{A}^a, \bar{B}^a, \bar{C}^a$ so that

$$\bar{A}^a \bar{B}_a = \bar{A}^a \bar{C}_a = \bar{B}^a \bar{C}_a = -1. \tag{16.12}$$

Proof
Consider the three null four-vectors $\bar{A}^a = mA^a$, $\bar{B}^a = pB^a$, $\bar{C}^a = qC^a$ where
$m, p, q$ are positive real numbers. Then condition (16.12) gives the three equations:

$$mp\,(AB) = -1, \quad mq\,(AC) = -1, \quad qp\,(BC) = -1$$

whose solution is:

$$m^2 = -\frac{(BC)^2}{h}, \quad p^2 = -\frac{(AC)^2}{h}, \quad q^2 = -\frac{(AB)^2}{h} \tag{16.13}$$

where $h = (AB)(BC)(AC) < 0$. We conclude that given any three linearly
independent null vectors it is always possible by rescaling to define a new set of
three linearly independent null vectors which are normalized according to (16.12).
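
The rescaling of the Lemma can be sketched in a few lines. This is only an illustration under the assumptions of the text (future directed, pairwise non-parallel null vectors, signature (-, +, +, +)); the function name `normalize_triad` and the sample triad are hypothetical.

```python
import math

def dot(u, v):
    """Lorentz inner product with signature (-, +, +, +)."""
    return -u[0] * v[0] + u[1] * v[1] + u[2] * v[2] + u[3] * v[3]

def normalize_triad(A, B, C):
    """Rescale A, B, C by the factors m, p, q of (16.13) so that all products are -1."""
    ab, ac, bc = dot(A, B), dot(A, C), dot(B, C)
    h = ab * bc * ac                 # negative, since each product is negative
    m = math.sqrt(-bc * bc / h)
    p = math.sqrt(-ac * ac / h)
    q = math.sqrt(-ab * ab / h)
    return [m * a for a in A], [p * b for b in B], [q * c for c in C]

# A sample triad of future-directed null vectors (illustrative values).
A, B, C = normalize_triad([1.0, 1.0, 0.0, 0.0],
                          [2.0, 0.0, 2.0, 0.0],
                          [3.0, 0.0, 0.0, 3.0])
print(dot(A, B), dot(A, C), dot(B, C))
```

Because the three factors are positive, the rescaled vectors remain null and future directed.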


Due to this Lemma, without restriction of the generality, in the following we
assume that the vectors $A^a, B^a, C^a$ of the null triad are normalized so that:

$$(AB) = (AC) = (BC) = -1. \tag{16.14}$$

We consider the two null vectors $A^a, B^a$ of the null triad and define the
characteristic observer $cI^a$ and the characteristic direction $J^a$ for this observer
by (16.7) and (16.8) respectively.

Next we consider the third null vector $C^a$. Its spatial part in the rest space
of the characteristic observer $cI^a$ is

$$C^a_\perp = h^a_b C^b = (\delta^a_b + I^a I_b) C^b = C^a + I^a(-\sqrt{2}) = C^a - A^a - B^a.$$

Its length is:

$$C_\perp = \sqrt{C_{\perp a} C^a_\perp} = \sqrt{(C_a - A_a - B_a)(C^a - A^a - B^a)} = \sqrt{-2C_a A^a - 2C_a B^a + 2A_a B^a} = \sqrt{2}.$$

The angle of $C^a_\perp$ with the characteristic direction $J^a$ is:

$$\cos\phi_{JC} = \frac{J_a C^a_\perp}{C_\perp} = \frac{1}{\sqrt{2}} \frac{1}{\sqrt{2}} (A_a - B_a)(C^a - A^a - B^a) = 0,$$

that is, the 3-direction $C^a_\perp$ of the third null vector $C^a$ in the rest space of the observer
$cI^a$ is normal to the characteristic direction $J^a$. We define the third spacelike unit
vector $K^a$ as follows

$$K^a = \frac{C^a_\perp}{C_\perp} = \frac{1}{\sqrt{2}}(C^a - A^a - B^a) = \frac{C^a}{\sqrt{2}} - I^a. \tag{16.15}$$

In terms of $K^a$ the third null vector $C^a$ is as follows

$$C^a = \sqrt{2}\,(K^a + I^a). \tag{16.16}$$


Having the three vectors of the tetrad we define the fourth spacelike unit⁹ four-
vector $L^a$ normal to $I^a, J^a, K^a$ by the requirement

$$L^a = \eta^{abcd} I_b J_c K_d.$$

In terms of the original triple of null vectors we find:

$$L^a = \eta^{abcd} I_b J_c K_d = \eta^{abcd} \frac{1}{2\sqrt{2}} (A_b + B_b)(A_c - B_c)(C_d - A_d - B_d)$$

$$= \frac{1}{2\sqrt{2}}\, \eta^{abcd} (-2A_b B_c)(C_d - A_d - B_d)$$

$$= -\frac{1}{\sqrt{2}}\, \eta^{abcd} A_b B_c C_d.$$


We deduce that:

The triple of the three linearly independent future directed null vectors
$A^a, B^a, C^a$ defines in spacetime the right handed orthonormal tetrad
$(I^a, J^a, K^a, L^a)$.

We note that this tetrad has been defined by the original pair of null vectors
$A^a, B^a$, hence there are two more options to consider, that is the pairs $(A^a, C^a)$, $(B^a, C^a)$.

In terms of the orthonormal tetrad the Lorentz transformation associated with the
observer $cI^a$ is defined by the following matrix

$$L = (I^a, J^a, K^a, L^a) = \frac{1}{\sqrt{2}} \left( A^a + B^a,\; A^a - B^a,\; C^a - A^a - B^a,\; -\eta^{abcd} A_b B_c C_d \right).$$


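⁹ Proof

$$L_a L^a = \eta_{arst}\, \eta^{abcd}\, I^r J^s K^t I_b J_c K_d$$

$$= -\left( \delta^b_r \delta^c_s \delta^d_t - \delta^b_r \delta^c_t \delta^d_s + \delta^b_s \delta^c_t \delta^d_r - \delta^b_s \delta^c_r \delta^d_t + \delta^b_t \delta^c_r \delta^d_s - \delta^b_t \delta^c_s \delta^d_r \right) I^r J^s K^t I_b J_c K_d$$

$$= -\delta^b_r \delta^c_s \delta^d_t\, I^r J^s K^t I_b J_c K_d = -(II)(JJ)(KK) = 1.$$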




This expression indicates the "asymmetry" of the approach and also it does not
associate a unique Lorentz transformation to the null triad $A^a, B^a, C^a$.

We shall solve this problem in the next section. The following exercise concerns the
inverse approach, that is, to define a null triad of three linearly independent future
directed null vectors from a positively oriented orthonormal tetrad.

Exercise 16.2.1 Show that to the NOT consisting of the positively oriented
orthonormal tetrad $(I^a, J^a, K^a, L^a)$ with components

$$I^a = (1,0,0,0), \quad J^a = (0,1,0,0), \quad K^a = (0,0,1,0), \quad L^a = (0,0,0,1) \tag{16.17}$$

we may associate the three future directed null vectors of the null triad which in
the tetrad frame have components

$$A^a = \frac{1}{\sqrt{2}}(1,1,0,0), \quad B^a = \frac{1}{\sqrt{2}}(1,-1,0,0), \quad C^a = \sqrt{2}\,(1,0,1,0). \tag{16.18}$$


This is the result which has been used by Allcock¹⁰ in order to deal with the
"asymmetry" of Synge's approach. However Allcock still uses the two null vectors
approach, therefore the "asymmetry" remains in the sense that the observer he
defines (by his vector $S^a$) is not the characteristic observer of the triple.


16.2.3 The Characteristic Tetrad of a Null Triad 


Consider the three linearly independent null vectors $A^a, B^a, C^a$ of a null triad
normalized as in (16.14). Define the four-vector

$$I^a_{(1)} = \frac{1}{\sqrt{6}}(A^a + B^a + C^a). \tag{16.19}$$

It is shown easily that $I^a_{(1)}$ is a unit timelike four-vector which defines the
characteristic RIO for the null triad. This choice is completely symmetric in
$A^a, B^a, C^a$ and obviously it is unique.

We note that

$$I_{(1)a} A^a = I_{(1)a} B^a = I_{(1)a} C^a = -\frac{2}{\sqrt{6}},$$

that is, the frequencies of all three null vectors wrt the observer $I^a_{(1)}$ are equal.
We consider now the spatial parts $A^a_\perp = h^a_b A^b$, $B^a_\perp = h^a_b B^b$, $C^a_\perp = h^a_b C^b$ of the
null four-vectors of the triad wrt the characteristic observer $I^a_{(1)}$.


¹⁰ Ibid.




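We compute:

$$A^a_\perp = (\delta^a_b + I^a_{(1)} I_{(1)b}) A^b = A^a - \frac{1}{3}(A^a + B^a + C^a) = \frac{1}{3}(2A^a - B^a - C^a) \tag{16.20}$$

$$B^a_\perp = \cdots = \frac{1}{3}(2B^a - C^a - A^a) \tag{16.21}$$

$$C^a_\perp = \cdots = \frac{1}{3}(2C^a - A^a - B^a). \tag{16.22}$$

The length of the vectors $A^a_\perp, B^a_\perp, C^a_\perp$ is

$$A_\perp = B_\perp = C_\perp = \sqrt{\frac{2}{3}}. \tag{16.23}$$

For example for $A^a_\perp$ we have

$$A_\perp = \sqrt{A_{\perp a} A^a_\perp} = \frac{1}{3}\sqrt{(2A_a - B_a - C_a)(2A^a - B^a - C^a)} = \frac{1}{3}\sqrt{-4(AB) - 4(AC) + 2(BC)} = \sqrt{\frac{2}{3}}.$$

For the sum we find

$$A^a_\perp + B^a_\perp + C^a_\perp = 0 \tag{16.24}$$

which implies that the three vectors $A^a_\perp, B^a_\perp, C^a_\perp$ are coplanar in the rest space of
the characteristic observer $I^a_{(1)}$.

We compute next the angle between the 3-directions $A_\perp, B_\perp, C_\perp$ and find¹¹

$$\cos\theta_{A_\perp B_\perp} = \cos\theta_{A_\perp C_\perp} = \cos\theta_{B_\perp C_\perp} = -\frac{1}{2}, \tag{16.25}$$

that is, the three vectors $A^a_\perp, B^a_\perp, C^a_\perp$ make equal angles with each other, which is
120° or $\frac{2\pi}{3}$.

From the above we conclude that the three spacelike vectors are the bisectors of
an equilateral triangle which lies in the rest space of the characteristic observer $I^a_{(1)}$
of the null triad. Obviously everything in this approach is symmetric and uniquely
defined by the null triad!

¹¹ For example for the angle $\theta_{A_\perp B_\perp}$ we have

$$A_{\perp a} B^a_\perp = \frac{1}{9}(2A_a - B_a - C_a)(2B^a - C^a - A^a) = \frac{1}{9}\left[4(AB) - 2(AC) + (BC) + (AB) - 2(BC) + (AC)\right] = -\frac{1}{3}.$$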
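
The symmetric properties of the characteristic observer can be verified numerically. The sketch below assumes the signature (-, +, +, +) and the normalization (16.14); the helper names and the sample triad are illustrative, and the check is a spot test for one triad, not a proof.

```python
import math

def dot(u, v):
    """Lorentz inner product with signature (-, +, +, +)."""
    return -u[0] * v[0] + u[1] * v[1] + u[2] * v[2] + u[3] * v[3]

def normalize_triad(A, B, C):
    """Rescale three future-directed null vectors so that (AB) = (AC) = (BC) = -1."""
    ab, ac, bc = dot(A, B), dot(A, C), dot(B, C)
    h = ab * ac * bc
    m, p, q = (math.sqrt(-x * x / h) for x in (bc, ac, ab))
    return [m * a for a in A], [p * b for b in B], [q * c for c in C]

def perp(I, V):
    """Spatial part of V in the rest space of the unit timelike vector I."""
    c = dot(I, V)
    return [v + c * i for v, i in zip(V, I)]

def cos_angle(U, V):
    """Cosine of the angle between two spacelike vectors."""
    return dot(U, V) / math.sqrt(dot(U, U) * dot(V, V))

A, B, C = normalize_triad([1.0, 0.6, 0.8, 0.0],
                          [2.0, 0.0, 2.0, 0.0],
                          [3.0, 0.0, 1.8, 2.4])
I1 = [(a + b + c) / math.sqrt(6) for a, b, c in zip(A, B, C)]   # (16.19)
A_p, B_p, C_p = perp(I1, A), perp(I1, B), perp(I1, C)

print("zero components:", -dot(I1, A), -dot(I1, B), -dot(I1, C))
print("cosines:", cos_angle(A_p, B_p), cos_angle(A_p, C_p), cos_angle(B_p, C_p))
```

All three zero components agree, the three projections add up to zero, and each pair of projections has cosine -1/2, in line with (16.23) to (16.25).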

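In order to associate an orthonormal tetrad with the observer $I^a_{(1)}$ we consider
two normal spacelike vectors in the plane of the equilateral triangle. This choice
is unique modulo a Euclidean rotation in the 2d plane. We choose the mutually
orthogonal unit vectors to be along one bisector and along the opposite side of the
triangle. There are three possibilities, from which we choose the

$$J^a_{(1)} = \frac{A^a_\perp}{A_\perp} = \frac{1}{\sqrt{6}}(2A^a - B^a - C^a) \tag{16.26}$$

$$K^a_{(1)} = \frac{B^a_\perp - C^a_\perp}{|B_\perp - C_\perp|} = \frac{1}{\sqrt{2}}(B^a - C^a). \tag{16.27}$$

Exercise 16.2.2 Show that the vectors $B^a_\perp, C^a_\perp$ make the angle ±30° with the
direction of $K^a_{(1)}$.¹²

Exercise 16.2.3 Show that the null vectors $A^a, B^a, C^a$ in terms of the vectors
$I^a_{(1)}, J^a_{(1)}, K^a_{(1)}$ are given by the relations:

$$A^a = \frac{2}{\sqrt{6}}\left[ I^a_{(1)} + J^a_{(1)} \right] \tag{16.28}$$

$$B^a = \frac{1}{\sqrt{6}}\left[ 2I^a_{(1)} - J^a_{(1)} + \sqrt{3}\,K^a_{(1)} \right] \tag{16.29}$$

$$C^a = \frac{1}{\sqrt{6}}\left[ 2I^a_{(1)} - J^a_{(1)} - \sqrt{3}\,K^a_{(1)} \right]. \tag{16.30}$$

In order to complete the tetrad we introduce the unit spacelike vector¹³

$$L^a_{(1)} = \eta^{abcd} I_{(1)b} J_{(1)c} K_{(1)d} = \frac{1}{\sqrt{2}}\, \eta^{abcd} A_b B_c C_d$$

which is normal to the plane of the vectors $A^a_\perp, B^a_\perp, C^a_\perp$. Obviously the tetrad
$\{I^a_{(1)}, J^a_{(1)}, K^a_{(1)}, L^a_{(1)}\}$ is uniquely defined by the null triad modulo a 2d Euclidean
rotation of the rest space of the characteristic observer $I^a_{(1)}$.

¹² This was expected. Why?

¹³ The calculation of the length is as follows

$$L_{(1)a} L^a_{(1)} = \frac{1}{2}\, \eta^{abcd}\, \eta_{arst}\, A_b B_c C_d A^r B^s C^t$$

$$= -\frac{1}{2} \left( \delta^b_r \delta^c_s \delta^d_t - \delta^b_r \delta^c_t \delta^d_s + \delta^b_s \delta^c_t \delta^d_r - \delta^b_s \delta^c_r \delta^d_t + \delta^b_t \delta^c_r \delta^d_s - \delta^b_t \delta^c_s \delta^d_r \right) A_b B_c C_d A^r B^s C^t$$

$$= -\frac{1}{2} \left( \delta^b_s \delta^c_t \delta^d_r + \delta^b_t \delta^c_r \delta^d_s \right) A_b B_c C_d A^r B^s C^t$$

$$= -\frac{1}{2} \left[ (BA)(CB)(AC) + (CA)(AB)(BC) \right] = 1.$$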


Summarizing, we have the following:

1. A null triad $\{A^a, B^a, C^a\}$ defines a unique class of characteristic observers with
four-velocity:

$$cI^a_{(1)} = \frac{c}{\sqrt{6}}(A^a + B^a + C^a).$$

2. The zero components of the null triad four-vectors wrt this observer are all equal to
$\sqrt{\frac{2}{3}}$ (photons have the same frequency).

3. In the rest space of the characteristic observer the directions $A^a_\perp, B^a_\perp, C^a_\perp$ of the
three null vectors of the null triad are coplanar and lie along the directions of the
bisectors of an equilateral triangle of side $\sqrt{2}$.

4. In addition to the characteristic observer the null triad $\{A^a, B^a, C^a\}$ defines a
characteristic spacelike direction with unit vector:

$$L^a_{(1)} = \frac{1}{\sqrt{2}}\, \eta^{abcd} A_b B_c C_d$$

normal to the plane spanned by the directions $A^a_\perp, B^a_\perp, C^a_\perp$ of the three null
vectors of the null triad.

5. In the spacelike 2-plane of the four-vectors $A^a_\perp, B^a_\perp, C^a_\perp$ we choose the unit four-
vectors $J^a_{(1)} = \sqrt{\frac{3}{2}}\, A^a_\perp$ and $K^a_{(1)} = \frac{1}{\sqrt{2}}(B^a - C^a)$ and define the unit spacelike
vector $L^a_{(1)}$ so that the orthonormal tetrad $\{I^a_{(1)}, J^a_{(1)}, K^a_{(1)}, L^a_{(1)}\}$ is right handed.
This tetrad is unique up to a rotation of the 2-plane spanned by the three four-
vectors $A^a_\perp, B^a_\perp, C^a_\perp$.
This orthonormal tetrad has the highest possible symmetric expressions wrt
the null triad because the vectors $B^a_\perp, C^a_\perp$ make an angle ±30° with the direction
of $K^a_{(1)}$.

6. The proper Lorentz transformation $L^a{}_b$ relating the NOT with the tetrad (i.e. the
characteristic observer!) defined by the null triad is as follows

$$L^a{}_0 = I^a_{(1)} = \frac{1}{\sqrt{6}}(A^a + B^a + C^a)$$

$$L^a{}_1 = J^a_{(1)} = \frac{1}{\sqrt{6}}(2A^a - B^a - C^a)$$

$$L^a{}_2 = K^a_{(1)} = \frac{1}{\sqrt{2}}(B^a - C^a)$$

$$L^a{}_3 = L^a_{(1)} = \frac{1}{\sqrt{2}}\, \eta^{abcd} A_b B_c C_d.$$


16.3. The Null Tetrad 


Another approach to the tetrads, which is used in the classification of spacetimes
and in other problems related to the study of radiation, is the null tetrad (sometimes
referred to as complex tetrad), which is a tetrad consisting of two real null vectors and
two complex null vectors. The two real null vectors produce the timelike and one
spacelike direction, as is done in Synge's approach, and the two complex null
vectors the remaining two spatial directions. The two real null vectors are assumed to
have opposite directions (that is, one is future directed and the other past directed),
hence they are normalized to +1, contrary to the previous considerations where both
null vectors had the same orientation, hence they were normalized to −1.

Consider an orthonormal tetrad consisting of the four vectors $u^a, r^a, x^a, y^a$
normalized as follows:

$$u^a u_a = -1, \quad r^a r_a = 1, \quad x^a x_a = 1, \quad y^a y_a = 1. \tag{16.31}$$

We note that $u^a$ is timelike and the remaining four-vectors $r^a, x^a, y^a$ are spacelike.
These vectors form an orthonormal Lorentz basis in which the Lorentz metric is:

$$\eta_{ab} = -u_a u_b + r_a r_b + x_a x_b + y_a y_b. \tag{16.32}$$


Define the vectors:

$$l^a = \frac{1}{\sqrt{2}}(u^a + r^a) \tag{16.33}$$

$$m^a = \frac{1}{\sqrt{2}}(r^a - u^a). \tag{16.34}$$

We compute:

$$l^a l_a = 0, \quad m^a m_a = 0, \quad l^a m_a = 1 \tag{16.35}$$

that is, the vectors $l^a, m^a$ are null and non-parallel (if they were parallel then $l^a m_a =
0$). The inner product equals +1 because $l^a$ is future directed and $m^a$ past directed.
We compute the inner products

$$l^a x_a = l^a y_a = m^a x_a = m^a y_a = 0.$$

Relations (16.33) and (16.34) are inverted to give:

$$u^a = \frac{1}{\sqrt{2}}(l^a - m^a) \tag{16.36}$$

$$r^a = \frac{1}{\sqrt{2}}(m^a + l^a). \tag{16.37}$$

This proves that the vectors $l^a, m^a, x^a, y^a$ are linearly independent and form a
basis in Minkowski space.
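We define the complex vectors:

$$t^a = \frac{1}{\sqrt{2}}(x^a + i y^a) \tag{16.38}$$

$$\bar{t}^a = \frac{1}{\sqrt{2}}(x^a - i y^a) \tag{16.39}$$

from which follows

$$x^a = \frac{1}{\sqrt{2}}(t^a + \bar{t}^a), \quad y^a = \frac{i}{\sqrt{2}}(\bar{t}^a - t^a). \tag{16.40}$$

We compute

$$t^a t_a = \frac{1}{2}\left(1 - 1 + 2i\, x^a y_a\right) = 0 \tag{16.41}$$

$$\bar{t}^a \bar{t}_a = \frac{1}{2}\left(1 - 1 - 2i\, x^a y_a\right) = 0 \tag{16.42}$$

$$t^a \bar{t}_a = \frac{1}{2}\left(x^a x_a - i^2 y^a y_a\right) = 1. \tag{16.43}$$

Therefore the inner products are

$$l^a l_a = m^a m_a = t^a t_a = \bar{t}^a \bar{t}_a = 0 \tag{16.44}$$

$$l^a m_a = t^a \bar{t}_a = 1. \tag{16.45}$$

We conclude that the vectors $l^a, m^a, t^a, \bar{t}^a$ form a basis for Minkowski space. This
tetrad is not orthonormal and we call it a null tetrad.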
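
The inner products of the null tetrad are easy to confirm with complex arithmetic. The following is a small sketch, assuming components in the orthonormal frame $u, r, x, y$ and the bilinear (not Hermitian) Lorentz product; the variable names are illustrative.

```python
import math

def dot(u, v):
    """Bilinear Lorentz product, signature (-, +, +, +); no complex conjugation."""
    return -u[0] * v[0] + u[1] * v[1] + u[2] * v[2] + u[3] * v[3]

s = 1 / math.sqrt(2)
u, r, x, y = [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]

l = [s * (a + b) for a, b in zip(u, r)]          # (16.33)
m = [s * (b - a) for a, b in zip(u, r)]          # (16.34)
t = [s * (a + 1j * b) for a, b in zip(x, y)]     # (16.38)
tbar = [z.conjugate() for z in t]                # (16.39)

products = {
    "ll": dot(l, l), "mm": dot(m, m), "tt": dot(t, t), "tbar tbar": dot(tbar, tbar),
    "lm": dot(l, m), "t tbar": dot(t, tbar), "lt": dot(l, t), "mt": dot(m, t),
}
for name, value in products.items():
    print(name, value)
```

Only the products `lm` and `t tbar` are different from zero, and both equal 1 up to rounding.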

In order to compute the metric in a null tetrad we use (16.36) and (16.37) to replace $u^a, r^a$
in terms of $l^a, m^a$, and (16.40) to replace $x^a, y^a$ in terms of $t^a, \bar{t}^a$, in (16.32).
We find

$$\eta_{ab} = -\frac{1}{2}(m_a - l_a)(m_b - l_b) + \frac{1}{2}(m_a + l_a)(m_b + l_b) + x_a x_b + y_a y_b$$

$$= m_a l_b + l_a m_b + x_a x_b + y_a y_b$$

$$= 2 l_{(a} m_{b)} + 2 t_{(a} \bar{t}_{b)}. \tag{16.46}$$


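Exercise 16.3.1 Compute the metric $\eta_{ab}$ and the Levi Civita tensor density in the
null tetrad frame.

Solution

In the orthonormal tetrad frame $u^a, r^a, x^a, y^a$ we have the components $u^a =
(1,0,0,0)^t$, $r^a = (0,1,0,0)^t$, $x^a = (0,0,1,0)^t$, $y^a = (0,0,0,1)^t$.

In this basis the components of the vectors of the null tetrad are:

$$l^a = \frac{1}{\sqrt{2}}(1,1,0,0)^t, \quad m^a = \frac{1}{\sqrt{2}}(-1,1,0,0)^t, \quad t^a = \frac{1}{\sqrt{2}}(0,0,1,i)^t, \quad \bar{t}^a = \frac{1}{\sqrt{2}}(0,0,1,-i)^t. \tag{16.47}$$

The metric is $\eta_{ab} = 2 l_{(a} m_{b)} + 2 t_{(a} \bar{t}_{b)}$. We compute the components

$$\eta_{00} = 2 l_0 m_0 + 2 t_0 \bar{t}_0 = -1$$

$$\eta_{01} = l_0 m_1 + l_1 m_0 + 0 = 0.$$

Working similarly we find

$$\eta_{ab} = \mathrm{diag}(-1, 1, 1, 1).$$

The determinant is

$$\det \eta_{ab} = -1.$$

For the Levi Civita tensor density we have

$$\eta_{abcd}\, l^a m^b t^c \bar{t}^d = \det\{l^a, m^a, t^a, \bar{t}^a\}$$

where $l^a$ is the first column of the matrix etc. In the null tetrad we get

$$\eta_{abcd}\, l^a m^b t^c \bar{t}^d = \frac{1}{4} \begin{vmatrix} 1 & -1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & i & -i \end{vmatrix} = \frac{1}{4} \cdot 2 \cdot (-2i) = -i.$$

Using the relation $\eta^{abcd} = -\eta^{ar} \eta^{bs} \eta^{cq} \eta^{dp} \eta_{rsqp}$ and equation (16.46) we find that

$$\eta_{abcd} = -4!\, i\, l_{[a} m_b t_c \bar{t}_{d]}$$

$$\eta^{abcd} = 4!\, i\, l^{[a} m^b t^c \bar{t}^{d]}.$$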


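
The two results of Exercise 16.3.1 can be cross-checked numerically. This is a hedged sketch using the components (16.47); the helper names `lower` and `det` are illustrative, and the determinant is computed by a plain Laplace expansion, which is adequate for a 4 × 4 matrix.

```python
import math

s = 1 / math.sqrt(2)
l = [s, s, 0, 0]
m = [-s, s, 0, 0]
t = [0, 0, s, 1j * s]
tbar = [0, 0, s, -1j * s]

def lower(v):
    """Lower the index with eta = diag(-1, 1, 1, 1)."""
    return [-v[0], v[1], v[2], v[3]]

def det(M):
    """Determinant by Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

# Metric from (16.46): eta_ab = l_a m_b + m_a l_b + t_a tbar_b + tbar_a t_b.
ld, md, td, tbd = lower(l), lower(m), lower(t), lower(tbar)
eta = [[ld[a] * md[b] + md[a] * ld[b] + td[a] * tbd[b] + tbd[a] * td[b]
        for b in range(4)] for a in range(4)]

# Matrix whose columns are l, m, t, tbar.
columns = [[l[i], m[i], t[i], tbar[i]] for i in range(4)]
d = det(columns)

print("eta diagonal:", [eta[a][a] for a in range(4)])
print("det{l, m, t, tbar}:", d)
```

The reconstructed metric is diag(-1, 1, 1, 1) and the determinant equals -i up to rounding.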


16.3.1 The Proper Orthochronous Lorentz Transformation 
in Terms of the Null Tetrad 


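The Lorentz transformation is an isometry of Minkowski space, that is, $L^t \eta L =
\eta$. The proper orthochronous Lorentz transformation is the Lorentz transformation
with positive determinant which preserves the future light cone, that is, the sign of
the zero component of null vectors.

In the following we compute the expression of the proper orthochronous Lorentz
transformation $L^\uparrow_+$ in terms of the vectors of a null tetrad.

To do that we consider a null tetrad one of whose null vectors (the $l^a$ say) is an
eigenvector¹⁴ of $L^\uparrow_+$, i.e.

$$L^\uparrow_+(l^a) = A\, l^a, \quad A \in R \tag{16.48}$$

and compute the remaining actions $L^\uparrow_+(m^a)$, $L^\uparrow_+(x^a)$, $L^\uparrow_+(y^a)$ using the fact that the
Lorentz products of these vectors are invariant. We write:

$$L^\uparrow_+(l^a) = A\, l^a$$

$$L^\uparrow_+(x^a) = \alpha_1 x^a + \beta_1 y^a + \gamma_1 l^a + \delta_1 m^a$$

$$L^\uparrow_+(y^a) = \alpha_2 x^a + \beta_2 y^a + \gamma_2 l^a + \delta_2 m^a$$

$$L^\uparrow_+(m^a) = \alpha_3 x^a + \beta_3 y^a + \gamma_3 l^a + \delta_3 m^a$$

where the coefficients $\alpha_\mu, \beta_\mu, \gamma_\mu, \delta_\mu$, $\mu = 1, 2, 3$ have to be determined. In matrix form
we have:

$$L^\uparrow_+ \begin{pmatrix} l \\ m \\ x \\ y \end{pmatrix} = \begin{pmatrix} A & 0 & 0 & 0 \\ \gamma_3 & \delta_3 & \alpha_3 & \beta_3 \\ \gamma_1 & \delta_1 & \alpha_1 & \beta_1 \\ \gamma_2 & \delta_2 & \alpha_2 & \beta_2 \end{pmatrix} \begin{pmatrix} l \\ m \\ x \\ y \end{pmatrix}.$$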


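The invariance of the inner products

$$x^a l_a = 0, \quad x^a x_a = 1, \quad x^a y_a = 0, \quad x^a m_a = 0$$

implies the relations

$$L^\uparrow_+(x^a) L^\uparrow_+(l_a) = 0, \quad L^\uparrow_+(x^a) L^\uparrow_+(x_a) = 1, \quad L^\uparrow_+(x^a) L^\uparrow_+(y_a) = 0, \quad L^\uparrow_+(x^a) L^\uparrow_+(m_a) = 0.$$

We consider each inner product separately. We have:

Vector $x^a$:

$$L^\uparrow_+(x^a) L^\uparrow_+(l_a) = 0 \Rightarrow \delta_1 = 0 \tag{16.49}$$

$$L^\uparrow_+(x^a) L^\uparrow_+(y_a) = 0 \Rightarrow \alpha_1 \alpha_2 + \beta_1 \beta_2 + \gamma_1 \delta_2 = 0 \tag{16.50}$$

$$L^\uparrow_+(x^a) L^\uparrow_+(x_a) = 1 \Rightarrow \alpha_1^2 + \beta_1^2 = 1 \tag{16.51}$$

$$L^\uparrow_+(x^a) L^\uparrow_+(m_a) = 0 \Rightarrow \alpha_1 \alpha_3 + \beta_1 \beta_3 + \gamma_1 \delta_3 = 0. \tag{16.52}$$

Vector $y^a$:

$$L^\uparrow_+(y^a) L^\uparrow_+(l_a) = 0 \Rightarrow \delta_2 = 0 \tag{16.53}$$

$$L^\uparrow_+(y^a) L^\uparrow_+(y_a) = 1 \Rightarrow \alpha_2^2 + \beta_2^2 = 1 \tag{16.54}$$

$$L^\uparrow_+(y^a) L^\uparrow_+(m_a) = 0 \Rightarrow \alpha_2 \alpha_3 + \beta_2 \beta_3 + \gamma_2 \delta_3 = 0. \tag{16.55}$$

Vector $m^a$:

$$L^\uparrow_+(m^a) L^\uparrow_+(l_a) = 1 \Rightarrow A \delta_3 = 1 \Rightarrow \delta_3 = \frac{1}{A} \tag{16.56}$$

$$L^\uparrow_+(m^a) L^\uparrow_+(m_a) = 0 \Rightarrow \alpha_3^2 + \beta_3^2 + \frac{2}{A} \gamma_3 = 0. \tag{16.57}$$

Equations (16.51), (16.54), (16.50) and (16.53) have the solution

$$\left. \begin{aligned} \alpha_1^2 + \beta_1^2 &= 1 \\ \alpha_2^2 + \beta_2^2 &= 1 \\ \alpha_1 \alpha_2 + \beta_1 \beta_2 &= 0 \end{aligned} \right\} \Rightarrow \begin{aligned} \alpha_1 = \beta_2 &= \alpha \\ \beta_1 = -\alpha_2 &= -\beta \end{aligned} \quad \text{where } \alpha^2 + \beta^2 = 1. \tag{16.58}$$

The rest of the equations become:

$$\alpha \alpha_3 - \beta \beta_3 + \frac{1}{A} \gamma_1 = 0$$

$$\beta \alpha_3 + \alpha \beta_3 + \frac{1}{A} \gamma_2 = 0$$

$$\alpha_3^2 + \beta_3^2 + \frac{2}{A} \gamma_3 = 0.$$

¹⁴ The fact that such a null vector $l^a$ exists is based on the property that "Every proper
orthochronous Lorentz transformation leaves at least one null direction invariant". The proof
of this statement is as follows. $L^\uparrow_+$ maps the future light cone onto itself and thus generates a
continuous mapping of the unit sphere onto itself (which does not change the orientation of the
closed curves on the sphere). According to the fixed point theorem of Brouwer there is always at
least one point of the sphere which remains fixed under such a mapping, and the null direction
which corresponds to this point is left invariant under the action of the transformation $L^\uparrow_+$.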




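From these, writing the solution in terms of two real parameters $\gamma, \delta$, we get

$$\alpha_3 = \sqrt{2}\,\gamma$$

$$\beta_3 = -\sqrt{2}\,\delta$$

and

$$\gamma_1 = -\sqrt{2}\,A(\alpha\gamma + \beta\delta)$$

$$\gamma_2 = \sqrt{2}\,A(\alpha\delta - \beta\gamma)$$

$$\gamma_3 = -A(\gamma^2 + \delta^2).$$

Therefore we have the following representation of the orthochronous proper
Lorentz transformation $L^\uparrow_+$ in the original tetrad:

$$L^\uparrow_+(l^a) = A\, l^a$$

$$L^\uparrow_+(x^a) = \alpha x^a - \beta y^a - \sqrt{2}\,A(\alpha\gamma + \beta\delta)\, l^a$$

$$L^\uparrow_+(y^a) = \beta x^a + \alpha y^a + \sqrt{2}\,A(\alpha\delta - \beta\gamma)\, l^a$$

$$L^\uparrow_+(m^a) = \sqrt{2}\,\gamma x^a - \sqrt{2}\,\delta y^a - A(\gamma^2 + \delta^2)\, l^a + \frac{1}{A} m^a$$

where $\alpha^2 + \beta^2 = 1$ ($\Leftrightarrow \alpha = \cos\theta$, $\beta = \sin\theta$).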
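
As a sanity check, the representation above can be tested for the isometry property. The sketch below builds the images of the orthonormal basis $u, r, x, y$ from the images of $l, m, x, y$ by linearity and verifies that their Lorentz products reproduce $\eta = \mathrm{diag}(-1,1,1,1)$. The function names and the sample parameter values are illustrative assumptions, and the check covers the isometry and the orthochronous condition for $A > 0$ only.

```python
import math

def dot(u, v):
    """Lorentz inner product with signature (-, +, +, +)."""
    return -u[0] * v[0] + u[1] * v[1] + u[2] * v[2] + u[3] * v[3]

def comb(*terms):
    """Linear combination sum(c * v) of four-vectors given as (c, v) pairs."""
    return [sum(c * v[i] for c, v in terms) for i in range(4)]

def lorentz_images(A, theta, gamma, delta):
    """Images of u, r, x, y under the transformation with parameters A, theta, gamma, delta."""
    s, w = 1 / math.sqrt(2), math.sqrt(2)
    u, r, x, y = [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]
    l, m = comb((s, u), (s, r)), comb((s, r), (-s, u))
    a, b = math.cos(theta), math.sin(theta)
    Ll = comb((A, l))
    Lx = comb((a, x), (-b, y), (-w * A * (a * gamma + b * delta), l))
    Ly = comb((b, x), (a, y), (w * A * (a * delta - b * gamma), l))
    Lm = comb((w * gamma, x), (-w * delta, y),
              (-A * (gamma ** 2 + delta ** 2), l), (1 / A, m))
    # u = (l - m)/sqrt(2) and r = (l + m)/sqrt(2), by (16.36) and (16.37).
    Lu, Lr = comb((s, Ll), (-s, Lm)), comb((s, Ll), (s, Lm))
    return [Lu, Lr, Lx, Ly]

images = lorentz_images(A=1.7, theta=0.4, gamma=0.3, delta=-0.8)
gram = [[dot(images[i], images[j]) for j in range(4)] for i in range(4)]
for row in gram:
    print([round(entry, 12) for entry in row])
```

The printed Gram matrix is diag(-1, 1, 1, 1), and the zero component of the image of $u$ is positive, as expected for an orthochronous transformation.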
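In order to compute $L^\uparrow_+$ in the null tetrad we use the linearity of the Lorentz
transformation. We have:

$$L^\uparrow_+(t^a) = \frac{1}{\sqrt{2}}\left[ L^\uparrow_+(x^a) + i L^\uparrow_+(y^a) \right]$$

$$= \frac{1}{\sqrt{2}}\left[ (\alpha + i\beta) x^a + i(\alpha + i\beta) y^a + \sqrt{2}\,A(\alpha + i\beta)(-\gamma + i\delta)\, l^a \right]$$

$$= (\alpha + i\beta)\, t^a + A(\alpha + i\beta)(-\gamma + i\delta)\, l^a$$

$$= (\alpha + i\beta)\left[ t^a + A(-\gamma + i\delta)\, l^a \right].$$

Working similarly we find

$$L^\uparrow_+(\bar{t}^a) = (\alpha - i\beta)\left[ \bar{t}^a - A(\gamma + i\delta)\, l^a \right].$$

Then

$$L^\uparrow_+(m^a) = \sqrt{2}\,\gamma x^a - \sqrt{2}\,\delta y^a - A(\gamma^2 + \delta^2)\, l^a + \frac{1}{A} m^a$$

$$= \gamma (t^a + \bar{t}^a) - i\delta (\bar{t}^a - t^a) - A(\gamma^2 + \delta^2)\, l^a + \frac{1}{A} m^a$$

$$= \frac{1}{A}\left[ m^a - A^2(\gamma^2 + \delta^2)\, l^a + A(\gamma + i\delta)\, t^a + A(\gamma - i\delta)\, \bar{t}^a \right].$$

We define the quantities

$$e^\psi = A; \quad e^{i\phi} = \alpha - i\beta; \quad \varepsilon = -A(\gamma - i\delta)$$

and the transformation relations for $L^\uparrow_+$ become

$$L^\uparrow_+(l^a) = e^\psi\, l^a \tag{16.59}$$

$$L^\uparrow_+(m^a) = e^{-\psi}\left( m^a - \varepsilon\bar{\varepsilon}\, l^a - \bar{\varepsilon}\, t^a - \varepsilon\, \bar{t}^a \right) \tag{16.60}$$

$$L^\uparrow_+(t^a) = e^{-i\phi}\left( t^a + \varepsilon\, l^a \right) \tag{16.61}$$

$$L^\uparrow_+(\bar{t}^a) = e^{i\phi}\left( \bar{t}^a + \bar{\varepsilon}\, l^a \right). \tag{16.62}$$

The matrix representation of $L^\uparrow_+$ in the null tetrad becomes:

$$L^\uparrow_+ \begin{pmatrix} l^a \\ m^a \\ t^a \\ \bar{t}^a \end{pmatrix} = \begin{pmatrix} e^\psi & 0 & 0 & 0 \\ -e^{-\psi}\varepsilon\bar{\varepsilon} & e^{-\psi} & -e^{-\psi}\bar{\varepsilon} & -e^{-\psi}\varepsilon \\ e^{-i\phi}\varepsilon & 0 & e^{-i\phi} & 0 \\ e^{i\phi}\bar{\varepsilon} & 0 & 0 & e^{i\phi} \end{pmatrix} \begin{pmatrix} l^a \\ m^a \\ t^a \\ \bar{t}^a \end{pmatrix}. \tag{16.63}$$

We note that we have two real parameters (the $\psi, \phi$) and one complex (the $\varepsilon$).
For $\varepsilon = 0$ the transformation is

$$L^\uparrow_+(l^a) = e^\psi\, l^a \tag{16.64}$$

$$L^\uparrow_+(m^a) = e^{-\psi}\, m^a \tag{16.65}$$

$$L^\uparrow_+(t^a) = e^{-i\phi}\, t^a \tag{16.66}$$

$$L^\uparrow_+(\bar{t}^a) = e^{i\phi}\, \bar{t}^a \tag{16.67}$$

and is a dilatation in the plane defined by the two null vectors $l^a, m^a$ and a rotation
in the plane defined by the complex vectors $t^a, \bar{t}^a$.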



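For $\psi = \phi = 0$ the transformation becomes

$$L^\uparrow_+(l^a) = l^a \tag{16.68}$$

$$L^\uparrow_+(m^a) = m^a - \varepsilon\bar{\varepsilon}\, l^a - \bar{\varepsilon}\, t^a - \varepsilon\, \bar{t}^a \tag{16.69}$$

$$L^\uparrow_+(t^a) = t^a + \varepsilon\, l^a \tag{16.70}$$

$$L^\uparrow_+(\bar{t}^a) = \bar{t}^a + \bar{\varepsilon}\, l^a. \tag{16.71}$$

These transformations are called null rotations¹⁵ about the vector $l^a$. The general
transformation $L^\uparrow_+$ is the product of a special Lorentz transformation and a null
rotation. This product is not commutative.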


Exercise 16.3.2 Using the inner products of the vectors of the null tetrad show
that the special Lorentz transformation $L^\uparrow_{+s}$ and the null rotation $L^\uparrow_{+n}$ can be
written in the following canonical form

$$(L^\uparrow_{+s})^a{}_b = e^\psi\, l^a m_b + e^{-\psi}\, m^a l_b + e^{-i\phi}\, t^a \bar{t}_b + e^{i\phi}\, \bar{t}^a t_b \tag{16.72}$$

$$(L^\uparrow_{+n})^a{}_b = \delta^a_b + \varepsilon\, l^a \bar{t}_b + \bar{\varepsilon}\, l^a t_b - \bar{\varepsilon}\, t^a l_b - \varepsilon\, \bar{t}^a l_b - \varepsilon\bar{\varepsilon}\, l^a l_b. \tag{16.73}$$

In Exercise 16.3.3 it is shown that a null rotation can be written in terms of a null
bivector.

Exercise 16.3.3 Define the complex antisymmetric tensor

$$F_{ab} = 2\bar{\varepsilon}\, t_{[a} l_{b]}. \tag{16.74}$$

1. Prove that this is a null antisymmetric tensor, that is, it satisfies the property¹⁶

$$F_{ab} F^{ab} = 0. \tag{16.75}$$

2. Prove that

$$F_{ac} \bar{F}^{cb} = -\varepsilon\bar{\varepsilon}\, l_a l^b. \tag{16.76}$$

3. Then prove that a null rotation $L^\uparrow_{+n}$ can be written as follows

$$(L^\uparrow_{+n})_a{}^b = (\eta_{ac} + F_{ac})(\eta^{cb} + \bar{F}^{cb}). \tag{16.77}$$

4. Prove the relations

$$F_{ab} l^b = F_{ab} t^b = 0 \tag{16.78}$$

$$F_{ab} m^b = \bar{\varepsilon}\, t_a, \quad F_{ab} \bar{t}^b = -\bar{\varepsilon}\, l_a. \tag{16.79}$$

¹⁵ More information on the null rotations can be found in early papers such as: 1. R.K. Sachs, Proc.
Roy. Soc. (London) (1961), 264, 369; 2. H. Bondi, F.A.E. Pirani and I. Robinson, Proc. Roy. Soc.
(London) (1959), A251, 519; 3. S. Bazanski, J. Math. Phys. (1965), 6, 1201.

¹⁶ A bivector which satisfies this property is called a null bivector.


16.3.2 The Special Lorentz Transformation in Terms 
of Bivectors 


In this section we show that the special Lorentz transformation $L^\uparrow_{+s}$ can also be
written in terms of bivectors. We have the following result.

Exercise 16.3.4 Define the simple bivectors $f_{ab} = 2 l_{[a} m_{b]}$ and $f^*_{ab} = 2i\, t_{[a} \bar{t}_{b]}$ and
using the inner products of the tetrad vectors prove the relations

$$f_{ab} f^b{}_c = 2 l_{(a} m_{c)} \tag{16.80}$$

$$f_{ab} f^{ab} = -2, \quad f^*_{ab} f^{*ab} = 2 \tag{16.81}$$

$$f_a{}^b f_b{}^c f_c{}^d = f_a{}^d \tag{16.82}$$

$$f^*_a{}^b f^*_b{}^c f^*_c{}^d = -f^*_a{}^d \tag{16.83}$$

$$f_{ab} l^b = (l_a m_b - l_b m_a) l^b = l_a$$

$$f_{ab} m^b = (l_a m_b - l_b m_a) m^b = -m_a$$

$$f^*_{ab} t^b = i(t_a \bar{t}_b - t_b \bar{t}_a) t^b = i t_a$$

$$f^*_{ab} \bar{t}^b = i(t_a \bar{t}_b - t_b \bar{t}_a) \bar{t}^b = -i \bar{t}_a.$$

Proof

$$f_{ab} f^b{}_c = (l_a m_b - l_b m_a)(l^b m_c - l_c m^b) = 2 l_{(a} m_{c)}$$

$$f_a{}^b f_b{}^c f_c{}^d = (l_a m^c + m_a l^c)(l_c m^d - m_c l^d) = l_a m^d - m_a l^d = f_a{}^d.$$

Similarly we prove the remaining relations.

From Exercise 16.3.4 we deduce the following:

a. Relation (16.81) means that the bivectors $f_{ab}$ and $f^*_{ab}$ are not null bivectors.

b. Relations (16.82) and (16.83) mean that all odd powers of the bivectors $f_{ab}$ and
$f^*_{ab}$ reduce to the bivectors themselves (up to a sign) and all even powers to products
of the bivectors.

c. The remaining relations mean that the tetrad vectors $l_a, m_a$ and $t_a, \bar{t}_a$ are eigenvectors
of the bivectors $f_{ab}$ and $f^*_{ab}$ respectively.

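We write now the special Lorentz transformation in terms of powers of the
bivectors $f_{ab}$, $f^*_{ab}$. Due to property b. it is enough to consider the following general
expression

$$(L^\uparrow_{+s})^a{}_b = \delta^a_b + a_1 f^a{}_b + a_2 f^a{}_c f^c{}_b + a_3 f^{*a}{}_b + a_4 f^{*a}{}_c f^{*c}{}_b$$

where $a_\mu$, $\mu = 1, 2, 3, 4$ are real numbers. In order to calculate the numbers $a_\mu$ we
use relations (16.64), (16.65), (16.66) and (16.67) and have

$$L^\uparrow_{+s}(l^a) = e^\psi\, l^a = (1 + a_1 + a_2)\, l^a \tag{16.84}$$

$$L^\uparrow_{+s}(m^a) = e^{-\psi}\, m^a = (1 - a_1 + a_2)\, m^a \tag{16.85}$$

$$L^\uparrow_{+s}(t^a) = e^{-i\phi}\, t^a = (1 - i a_3 - a_4)\, t^a \tag{16.86}$$

$$L^\uparrow_{+s}(\bar{t}^a) = e^{i\phi}\, \bar{t}^a = (1 + i a_3 - a_4)\, \bar{t}^a \tag{16.87}$$

from which follows the solution:

$$a_1 = \sinh\psi, \quad a_2 = \cosh\psi - 1$$

$$a_3 = \sin\phi, \quad a_4 = 1 - \cos\phi.$$

Therefore in the bivector basis the special Lorentz transformation is the following

$$(L^\uparrow_{+s})^a{}_b = \delta^a_b + \sinh\psi\, f^a{}_b + (\cosh\psi - 1)\, f^a{}_c f^c{}_b + \sin\phi\, f^{*a}{}_b + (1 - \cos\phi)\, f^{*a}{}_c f^{*c}{}_b. \tag{16.88}$$

Exercise 16.3.5 Using that in the tetrad basis the metric is $\eta_{ab} = 2 l_{(a} m_{b)} +
2 t_{(a} \bar{t}_{b)}$ prove that

$$(L^\uparrow_{+s})^a{}_b = \sinh\psi\, f^a{}_b + \cosh\psi\, f^a{}_c f^c{}_b + \sin\phi\, f^{*a}{}_b - \cos\phi\, f^{*a}{}_c f^{*c}{}_b. \tag{16.89}$$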


Chapter 17
Geometric Description of Relativistic Interactions


17.1 Collisions and Geometry 


There is a fundamental difference concerning the concept of particle in Newtonian
and in relativistic Physics. In Newtonian Physics a particle is a 'thing' which has
been created once and since then exists as an absolute unit for ever. The
physical quantities associated with a particle are divided into two classes: the
ones which are inherent in the structure of the particle, such as mass, charge etc., and
characterize the identity of the Newtonian particle, and those which depend on the
motion of the particle in a reference system, such as velocity, linear momentum
etc. Newtonian particles are assumed to interact by collisions creating larger
systems. This interaction of particles happens in a way that the overall inherent
quantities of the particles are conserved (i.e. mass, charge) while some of the motion
dependent physical quantities, such as energy and linear momentum, are
also conserved. Finally the systems consisting of many particles have additional
physical quantities such as temperature, pressure etc.

In Special Relativity the scenario is drastically different. A relativistic particle is
not a 'thing' but a set of physical quantities which share the same frame as proper
or as characteristic frame, depending on whether they are timelike or spacelike respectively.
In this sense a 'particle' can appear for example as a neutron $n$ or as three particles
$p, e^-, \bar{\nu}_e$ according to the reaction $n \to p + e^- + \bar{\nu}_e$. In this sense an electron is a
set of two scalars (mass, charge), one vector (spin) and other physical fields making
up a catalogue which is not necessarily complete. That is, it is possible that some
experiment will indicate that the electron has an associated new physical property
which we do not know and which we have to consider. This new quantity will not change
the concept of the electron; however it might lead us to consider more types of
electrons (as we do with the positron), but that is all. In the relativistic approach we
have discovered only partially the inherent entities of creation and we should be
open to new discoveries.


© Springer Nature Switzerland AG 2019 611 
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-27347-7_17 




As in Newtonian theory, in Special (and General) Relativity the relativistic
particles, besides their inherent physical quantities, also have physical quantities
which depend on their motion in a LCF. These quantities are the four-velocity,
four-momentum etc. Furthermore they are allowed to interact creating larger
systems which, as in Newtonian Physics, introduce new physical quantities such
as (relativistic) temperature etc. During their interaction we assume that some of
the inherent and the non-inherent physical quantities of the particles are conserved.
For example we assume that the total charge and the total four-momentum are
conserved.

It becomes clear that in Special Relativity it is possible to look upon a
relativistic interaction as a transformation of a set of Lorentz tensors (scalars, four-
vectors, tensors) to another set of Lorentz tensors (scalars, four-vectors, etc.) which
may differ both in number and in type, and which preserves the total amount of each
of these quantities. This view is useful and important because it makes it possible to
geometrize particle interactions. One might ask:

Why should we want to geometrize relativistic interactions?

The answer is simple:

Because if we manage to do so, then we gain twofold, that is:

a. We shall be able to 'explain' various relativistic results and show that what is new
is not their mathematical expression but the physical explanation we have given
to mathematical expressions. In this sense we let 'Physics justify Geometry'.

b. It would be possible to produce new results which will be consistent mathemati-
cally and possibly lead to new physical phenomena we have not thought of yet. In
this sense we let 'Geometry propose Physics'.

In this chapter we discuss relativistic reactions in the above sense. However we
shall restrict our considerations to systems of four-momenta only, mainly because
at the level we work this is the physical quantity we are most interested in. However the
approach is otherwise general, so we shall speak of interacting four-vectors which
need not be four-momenta!


17.2 Geometric Description of Collisions in Newtonian 
Physics 


From the previous discussion it is clear that behind the physics of relativistic
collisions there exists a geometry, which would be interesting and useful to
recognize and study. The same holds for Newtonian collisions, although this is
not widely known and emphasized. In this section we discuss briefly the Newtonian
case for a simple model and in the next section we consider the relativistic
counterpart.

We consider two similar smooth (solid) spheres with masses $m_1, m_2$ which are
moving along the x-axis with velocities $v_1, v_2$ respectively. We assume that at
some moment the spheres collide centrally and after their collision they move again
along the x-axis with corresponding velocities $v_1', v_2'$. Newtonian conservation
laws of linear momentum and energy give:

$$m_1(v_1 - v_1') = m_2(v_2' - v_2) \quad \text{(Conservation of momentum)} \tag{17.1}$$

$$m_1(v_1^2 - v_1'^2) = m_2(v_2'^2 - v_2^2) \quad \text{(Conservation of energy)}. \tag{17.2}$$

Combining these two equations we find:

$$v_1 - v_2 = v_2' - v_1' \tag{17.3}$$

that is, the magnitude of the relative velocity of the spheres is preserved during the
elastic collision. Solving (17.1) and (17.2) with respect to $v_1', v_2'$ we find:


$$v_1' = \frac{1-k}{1+k}\, v_1 + \frac{2k}{1+k}\, v_2 \tag{17.4}$$

$$v_2' = \frac{2}{1+k}\, v_1 + \frac{k-1}{1+k}\, v_2 \tag{17.5}$$

where $k = m_2/m_1$. These equations can be written in the form:

$$\begin{pmatrix} v_1' \\ v_2' \end{pmatrix} = \begin{pmatrix} \frac{1-k}{1+k} & \frac{2k}{1+k} \\ \frac{2}{1+k} & \frac{k-1}{1+k} \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} \tag{17.6}$$

or

$$\begin{pmatrix} v_1' \\ v_2' \end{pmatrix} = M \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} \tag{17.7}$$

where $M$ is the matrix:

$$M = \begin{pmatrix} \frac{1-k}{1+k} & \frac{2k}{1+k} \\ \frac{2}{1+k} & \frac{k-1}{1+k} \end{pmatrix}. \tag{17.8}$$

The matrix $M$ contains all the information concerning the collision and it is the
geometric expression of the laws of conservation of linear momentum and energy.
In order to understand the geometric significance of the matrix $M$ (equivalently,
of the laws of conservation of linear momentum and energy), we interpret $M$ as
a transformation matrix in a linear space in which we interpret geometrically the
collision. We assume that in the specific example this space is a Euclidean two-
dimensional space, whose vectors are of the form $\begin{pmatrix} v_1 \\ v_2 \end{pmatrix}$. In this space the state of
the system of masses is described by the position vector $\begin{pmatrix} v_1 \\ v_2 \end{pmatrix}$, which after the
collision is transformed to the vector $\begin{pmatrix} v_1' \\ v_2' \end{pmatrix}$. Therefore, in that space, the collision
is described by the linear transformation (17.7).


We note that geometrically the collision is not characterized by the velocities, but 
with the matrix M which is defined in terms of the ratio k = ma of the masses. This 
observation is very important and has many consequences among which we note the 
following: 

v1). et ogre itt =e % 
1. If ( - is another initial point (i.e. different initial velocities), then for the same 
v2 

masses and type of collision we can write the result of a central collision without 

any further calculations, as the image point under the linear transformation M, 


that is: 
- _ 
Vv V1 
“} | =M (: ) 
Ug U2 


2. It is possible to study qualitatively this type of collision without any reference to 
velocities, by studying the geometric properties of the transformation matrix M. 


The second point is more important, because it geometrizes the physical process 
of collision and allows us to employ the powerful mathematical methods of 
Differential Geometry in the study of the problem. Of course the problem we 
considered is simple but the method can be generalized to more complex problems 
and make possible the study of physical problems, which otherwise would be 
very difficult to do. More generally, the geometrization of Mechanics is a long 
standing study which recently has been extended to the study of dynamical systems, 
which involve practically most branches of modern science ranging from physics to 
economics and medicine (chaos, Hamiltonian dynamics etc.). 
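
The geometric reading of the collision can be illustrated numerically. The following is a minimal sketch, assuming that M is the matrix of the standard one-dimensional elastic (central) collision; the explicit matrix entries, the function name `collision_matrix` and the sample masses and velocities are assumptions made for illustration only.

```python
import numpy as np

def collision_matrix(m1, m2):
    """Velocity map (v1, v2) -> (v1', v2') of a 1D elastic collision (assumed form of M)."""
    total = m1 + m2
    return np.array([[(m1 - m2) / total, 2.0 * m2 / total],
                     [2.0 * m1 / total, (m2 - m1) / total]])

m1, m2 = 2.0, 3.0                      # illustrative masses
M = collision_matrix(m1, m2)
G = np.diag([m1, m2])                  # diag(m1, m2), the kinetic energy form

v_before = np.array([1.5, -0.5])       # illustrative initial velocities
v_after = M @ v_before                 # image point under the linear map M

momentum_before = np.array([m1, m2]) @ v_before
momentum_after = np.array([m1, m2]) @ v_after
energy_before = 0.5 * v_before @ G @ v_before
energy_after = 0.5 * v_after @ G @ v_after

print("trace M =", np.trace(M))
print("det M   =", np.linalg.det(M))
print("M @ M   =", M @ M)
```

Any other pair of initial velocities is handled by the same matrix product, which is the content of point 1 above.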


Exercise 17.2.1 Show that the transformation matrix M satisfies the following
geometric properties.

1. M is symmetric, if and only if $m_1 = m_2$.
2. trace M = 0.
3. det M = -1.
4. $M^2 = I$ or $M^{-1} = M$ (M is a projective operator).
5. $\mathrm{diag}(m_1, m_2) = M^t \, \mathrm{diag}(m_1, m_2) \, M$.


17.3 Geometric Description of Relativistic Reactions 


The geometric description of Newtonian collisions cannot be transferred to Special 
Relativity as such, for the following two reasons: 


1. The photons have zero mass, therefore their description is not possible in a 
collision matrix containing masses. 

2. The number and the identity of the reacting particles in relativistic inelastic
collisions are not preserved. This means that the collision matrix is not square,
hence it has no inverse. This implies that the relativistic collision cannot be
seen as a transformation matrix of masses in a properly dimensioned linear space.
However, this might be possible for certain elastic collisions.


There are ways to circumvent the first point and to deal with the second,
but the required methods are new and outside the simple Newtonian approach
described in the last section.

However this does not worry us because as we explained relativistic reactions 
should be understood as transformations between sets of timelike four-vectors under 
the constraint of conservation of their sum (=conservation of four-momentum). 
Therefore the geometrization of relativistic reactions will be realized by the 
establishment of geometric relations among the momenta of the reacting particles 
and possibly some momenta of the produced particles. This approach will be
realized in two stages. The first stage involves the expression of the relativistic
quantities in terms of the invariants built from the four-vectors (mainly their length,
which corresponds to the masses of the particles). The second stage concerns the
description of the reaction with a matrix of 3-momenta. In the following, we shall 
deal briefly with the first stage, because the detailed development of both stages is 
involved and beyond the level of this book. 


17.4 The General Geometric Results 


Before we discuss the details of relativistic systems we consider some general 
geometric results which shall be used. The main result is the following Theorem 
(see also Proposition 1.11.1). 


Theorem 17.4.1 The sum of a set of future directed timelike and/or null four- 
vectors is a future directed timelike four-vector except if and only if all four-vectors 
are null and parallel in which case the sum is a null four vector parallel to the null 
vectors. 


Proof Let $A^a_{(I)}$, $I = 1, \ldots, n$ be a finite set of future directed particle four-vectors. Then we
have:

$$A^0_{(I)} > 0, \qquad A^a_{(I)} A_{(I)a} \le 0, \qquad I = 1, \ldots, n$$

and

$$\left( \sum_{I=1}^{n} A^a_{(I)} \right)^2 = \sum_{I=1}^{n} \left( A_{(I)} \right)^2 + 2 \sum_{1 \le I < J \le n} A^a_{(I)} A_{(J)a} \tag{17.9}$$

where $\left( A_{(I)} \right)^2 = A^a_{(I)} A_{(I)a}$.
We consider the terms in the rhs. For the first term we have:

$$\sum_{I=1}^{n} \left( A_{(I)} \right)^2 \le 0$$

where the equality holds if and only if all $A^a_{(I)}$ are null.


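Concerning the second term let us assume that there exists an $A^a_{(I)}$ which is
timelike. In the proper frame $\Sigma(A_{(I)})$ of $A^a_{(I)}$ the components of $A^a_{(I)}$ are:

$$A^a_{(I)} = \begin{pmatrix} A^{0+}_{(I)} \\ \mathbf{0} \end{pmatrix}_{\Sigma(A_{(I)})}.$$

Let $A^a_{(I)} A_{(J)a}$ be an arbitrary element in the second term and assume that in $\Sigma(A_{(I)})$
the four-vector $A^a_{(J)}$ has components:

$$A^a_{(J)} = \begin{pmatrix} A^0_{(J)} \\ \mathbf{A}_{(J)} \end{pmatrix}_{\Sigma(A_{(I)})}. \tag{17.10}$$

Then in $\Sigma(A_{(I)})$ we have:

$$\left. A^a_{(I)} A_{(J)a} \right|_{\Sigma(A_{(I)})} = -A^{0+}_{(I)} A^0_{(J)} < 0 \tag{17.11}$$

because the four-vectors are future directed. But $A^a_{(I)} A_{(J)a}$ is invariant, therefore:

$$A^a_{(I)} A_{(J)a} < 0$$

in all LCF.

Working similarly with the rest of the terms which contain $A^a_{(I)}$ we show that:

$$\sum_{J=1}^{n} A^a_{(I)} A_{(J)a} < 0.$$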
J=1 


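But $A^a_{(I)}$ is an arbitrary timelike four-vector, hence:

$$\sum_{1 \le I < J \le n} A^a_{(I)} A_{(J)a} < 0. \tag{17.12}$$

We conclude that if not all the four-vectors $A^a_{(I)}$, $I = 1, \ldots, n$ are null, then:

$$\left( \sum_{I=1}^{n} A^a_{(I)} \right)^2 < 0 \tag{17.13}$$

that is, the sum is a timelike four-vector.
We assume now that all four-vectors $A^a_{(I)}$, $I = 1, \ldots, n$ are null. Then
$\sum_{I=1}^{n} \left( A_{(I)} \right)^2 = 0$ and furthermore:

$$\sum_{1 \le I < J \le n} A^a_{(I)} A_{(J)a} \; \begin{cases} = 0 & \text{if and only if } A^a_{(1)} \,//\, \ldots \,//\, A^a_{(n)} \\ < 0 & \text{otherwise.} \end{cases}$$

The first conclusion is obvious. In order to prove the second we note that
in an arbitrary LCF $\Sigma$ the arbitrary (future directed) null four-vector $A^a_{(I)}$ has
components:

$$A^a_{(I)} = E_{(I)} \begin{pmatrix} 1 \\ \mathbf{e}_{(I)} \end{pmatrix}_{\Sigma}, \qquad I = 1, \ldots, n$$

where $E_{(I)} > 0$ and $\mathbf{e}_{(I)}$ is a unit 3-vector. Therefore in $\Sigma$ we have:

$$A^a_{(I)} A_{(J)a} = E_{(I)} E_{(J)} \left( -1 + \mathbf{e}_{(I)} \cdot \mathbf{e}_{(J)} \right).$$

But $\mathbf{e}_{(I)} \cdot \mathbf{e}_{(J)} \le 1$ and "=" holds if and only if $\mathbf{e}_{(I)} = \mathbf{e}_{(J)}$, $I, J = 1, \ldots, n$.
This implies $A^a_{(I)} \,//\, A^a_{(J)}$, $I, J = 1, \ldots, n$, which proves the second assertion and
completes¹ the proof of the Theorem. □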


¹ Why do we not have to consider the case in which one subset of the four-vectors is null and the subset of
the remaining four-vectors is timelike? Is this case covered?
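
Theorem 17.4.1 can be checked numerically on sample vectors. The sketch below is only an illustration under stated assumptions: the metric signature (-, +, +, +) is the one used in the proof above, while the helper names `future_timelike`, `future_null` and all numerical values are hypothetical.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])   # Lorentz metric, signature (-,+,+,+)

def inner(a, b):
    """Lorentz inner product A^a B_a."""
    return a @ ETA @ b

def future_timelike(length, spatial):
    """Future directed timelike four-vector with A^a A_a = -length**2."""
    spatial = np.asarray(spatial, dtype=float)
    return np.concatenate(([np.sqrt(length**2 + spatial @ spatial)], spatial))

def future_null(energy, direction):
    """Future directed null four-vector E (1, e) with e a unit 3-vector."""
    e = np.asarray(direction, dtype=float)
    e = e / np.linalg.norm(e)
    return energy * np.concatenate(([1.0], e))

# A mixed set: one timelike vector and two non-parallel null vectors
mixed = [future_timelike(1.0, [0.3, 0.0, 0.0]),
         future_null(2.0, [0.0, 1.0, 0.0]),
         future_null(0.5, [0.0, 0.0, 1.0])]
mixed_sum = sum(mixed)
mixed_norm = inner(mixed_sum, mixed_sum)          # expected negative (timelike)

# The exceptional case: all vectors null and parallel
parallel = [future_null(2.0, [1.0, 0.0, 0.0]), future_null(3.0, [1.0, 0.0, 0.0])]
parallel_sum = sum(parallel)
parallel_norm = inner(parallel_sum, parallel_sum)  # expected zero (null)

print(mixed_norm, parallel_norm)
```

The mixed set also bears on the question raised in the footnote: a single timelike member is enough to make every cross term involving it negative.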




17.4.1 The 1 +3 Decomposition of a Particle Four-Vector wrt 
a Timelike Four-Vector 


In Sect. 12.2.2 we discussed the 1 + 3 decomposition of a vector wrt a unit timelike
vector. In this section we slightly generalize this discussion and consider the 1 + 3
decomposition of a particle four-vector $A^i$ wrt a timelike four-vector $B^i$ which is
not necessarily unit. We start again from the identity:

$$A^i = \delta^i_j A^j = \left( \delta^i_j + \frac{1}{B^2} B^i B_j \right) A^j - \frac{1}{B^2} B^i \left( B_j A^j \right) \tag{17.14}$$

and write:

$$A^i = A^i_{\parallel} + A^i_{\perp} \tag{17.15}$$

where:

$$A^i_{\parallel} = -\frac{1}{B^2} \left( B_j A^j \right) B^i \tag{17.16}$$

$$A^i_{\perp} = \left( \delta^i_j + \frac{1}{B^2} B^i B_j \right) A^j. \tag{17.17}$$

The four-vector $A^i_{\parallel}$ is the parallel component and the four-vector $A^i_{\perp}$ the normal
projection of $A^i$ along $B^i$. The tensor:

$$h^i_j(B) = \delta^i_j + \frac{1}{B^2} B^i B_j \tag{17.18}$$

is the projection tensor associated with the four-vector $B^i$. We have:

$$A^i_{\perp} = h^i_j(B) A^j. \tag{17.19}$$
As we have seen the 2-tensor $h_{ij}(B)$ is very important in Relativity and it is used
extensively in all calculations. In the rest frame $\Sigma_B$ of $B^i$ the components of $h_{ij}(B)$
are:

$$h_{ij}(B) = \mathrm{diag}(0, 1, 1, 1)_{\Sigma_B} \tag{17.20}$$

and can be thought of as the Euclidean metric of the rest space of $B^i$. It satisfies the
properties:

$$h_{ij}(B) = h_{ji}(B), \qquad h^i_k(B) h^k_j(B) = h^i_j(B), \qquad h^i_i(B) = 3. \tag{17.21}$$


Summarizing we have:

$$A^a = -\frac{1}{B^2} \left( A^b B_b \right) B^a + h^a_b(B) A^b \tag{17.22}$$

and in matrix form²:

$$A^a = \begin{pmatrix} -\dfrac{A^b B_b}{B} \\ h^a_b(B) A^b \end{pmatrix}_{\Sigma_B}.$$


Having given the basic facts about the decomposition of four-vectors we are 
ready to discuss the geometry of systems of particle four-vectors. We emphasize 
that the results we shall obtain in the following apply to any type of interacting four- 
vectors and not only to the four-momentum four-vector. In the rest of the chapter we 
shall consider two special four-vector systems: the system A + B → C, which is the
generic reaction, and the system A + B → C + D, which is more general than the first.
The results we shall obtain will be generic in the sense that they will apply to all 
reactions of the type we consider. We shall illustrate the results by specific examples 
which demonstrate how one applies the general ideas in practice. Finally it should 
be remarked that the results we derive are all covariant, therefore it is possible to 
develop proper software (for algebraic computing) which will give the answer to 
any problem for sufficient given data! 
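
As a hint of how such software might start, the following is a minimal numerical sketch of the 1 + 3 decomposition (17.15) to (17.19). It assumes the signature (-, +, +, +) and $B^2 = -B^i B_i > 0$ as in (17.24); the function names `projector` and `decompose` and the sample components are hypothetical.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])   # Lorentz metric, signature (-,+,+,+)

def projector(B):
    """Mixed projection tensor h^i_j(B) = delta^i_j + B^i B_j / B^2, with B^2 = -B^i B_i."""
    B2 = -(B @ ETA @ B)
    return np.eye(4) + np.outer(B, ETA @ B) / B2

def decompose(A, B):
    """Return the parallel and normal parts of A with respect to the timelike B."""
    B2 = -(B @ ETA @ B)
    parallel = -(A @ ETA @ B) / B2 * B     # (17.16)
    normal = projector(B) @ A              # (17.19)
    return parallel, normal

# Illustrative four-vectors (B is timelike but not unit)
A = np.array([3.0, 1.0, 2.0, 0.5])
B = np.array([2.5, 0.5, -1.0, 0.0])

h = projector(B)
A_par, A_perp = decompose(A, B)

print("A_par  =", A_par)
print("A_perp =", A_perp)
print("A_perp . B =", A_perp @ ETA @ B)
```

Since only tensor operations are used, the same two functions work in any LCF, which is the practical meaning of the covariance remarked above.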


17.5 The System of Two to One Particle Four-Vectors 


The simplest system of particle four-vectors is the system consisting of two future
directed particle four-vectors $A^a$, $B^a$, and corresponds to the generic reaction A +
B → C. From these two four-vectors we define another two four-vectors, the $A^a + B^a$
and the $A^a - B^a$. The four-vector $A^a + B^a$ is a future directed particle four-vector
called the Center System (CS) four-vector (see Sect. 1.12). In case $A^a$, $B^a$ are four-momenta
the four-vector $A^a + B^a$ is the Center of Momenta four-vector and the
particle it defines is the center of momentum particle.

We shall express the inner products between the four-vectors $A^a$, $B^a$ in terms of
their lengths. The results will be used to compute the zeroth component of $A^a$ in the
proper frame of the other four-vector. The zeroth component, equivalently the
inner product $A^a B_a$, is computed from the identity:

$$(A^a + B^a)^2 = (A^a)^2 + (B^a)^2 + 2 A^a B_a \tag{17.23}$$

if we replace the lengths:

$$(A^a + B^a)^2 = -M^2, \qquad (A^a)^2 = -A^2, \qquad (B^a)^2 = -B^2 \qquad (M, A, B > 0). \tag{17.24}$$

We find:

$$A^a B_a = \frac{1}{2} \left[ -M^2 + A^2 + B^2 \right]. \tag{17.25}$$

² The unit in the direction of $B^a$ is $B^a / B$, hence the $B$ in the denominator.


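Concerning the spatial part of $A^a$ in $\Sigma_B$ we have:

$$h_{ab}(B) A^b = A_a + \frac{\left( A^b B_b \right)}{B^2} B_a = A_a + \frac{1}{2B^2} \left[ -M^2 + A^2 + B^2 \right] B_a. \tag{17.26}$$

The spatial part can be decomposed further in the unit spatial direction $\hat{\mathbf{A}}_{(B)}$ and
the length of $\mathbf{A}_{(B)}$ in $\Sigma_B$:

$$\mathbf{A}_{(B)} = \sqrt{h_{ab}(B) A^a A^b} \; \hat{\mathbf{A}}_{(B)}. \tag{17.27}$$

Finally for the four-vector $A^a$ we have the decomposition:

$$A^a = \begin{pmatrix} \dfrac{M^2 - A^2 - B^2}{2B} \\ \sqrt{h_{ab}(B) A^a A^b} \; \hat{\mathbf{A}}_{(B)} \end{pmatrix}_{\Sigma_B}. \tag{17.28}$$

We see that the four-vector $A^a$ is determined from the invariants $A$, $B$, $M$ and
the unit spatial direction $\hat{\mathbf{A}}_{(B)}$.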


Exercise 17.5.1 For every non-null four-vector $A^i$ the symmetric tensor
$h_{ij}(A) = \eta_{ij} - \frac{1}{A^k A_k} A_i A_j$ projects normal to the vector $A^i$, that is $h_{ij}(A) A^j = 0$.

a. Let $p^i$ be the four-momentum of a particle and $p_1^i$ the four-momentum of another
particle. Show the identity:

$$p_1^i = \frac{p_1^j p_j}{p^k p_k} \, p^i + h^i_j(p) \, p_1^j.$$

This identity defines the 1 + 3 decomposition of the four-vector $p_1^i$ wrt the
four-vector $p^i$. The part $p^i_{1\parallel} = \frac{p_1^j p_j}{p^k p_k} p^i$ is the parallel part and the part
$p^i_{1\perp} = h^i_j(p) p_1^j$ is the normal part. Show that the inner product
$p_1^i p_i = \frac{1}{2} \left\{ (p^i + p_1^i)^2 - p^i p_i - p_1^i p_{1i} \right\}$, that is, it is expressed in terms of the lengths of
the four-vectors. In order to give the above a physical interpretation we consider
the masses $p^i p_i = -M^2 c^2$, $p_1^i p_{1i} = -m_1^2 c^2$ and then the inner product $p_1^i p_i = -M E_1^p$,
where $E_1^p$ is the energy of the particle of four-momentum $p_1^i$ in the
proper frame $\Sigma_p$ of the particle with four-momentum $p^i$. Define $p_2^i = p^i + p_1^i$
and show that:

$$E_1^p = \frac{m_2^2 - M^2 - m_1^2}{2M} \, c^2$$

where $p_2^i p_{2i} = -m_2^2 c^2$. Also show that the length of the normal part $p^i_{1\perp}$ is
given by the relation:

$$p^i_{1\perp} p_{1\perp i} = \frac{(E_1^p)^2}{c^2} - m_1^2 c^2 = \frac{\left[ m_2^2 - (M + m_1)^2 \right] \left[ m_2^2 - (M - m_1)^2 \right]}{4M^2} \, c^2.$$

Finally collect the results of the 1 + 3 decomposition of the four-vector $p_1^i$ in $\Sigma_p$
as follows:

$$p_1^i = \begin{pmatrix} \dfrac{m_2^2 - M^2 - m_1^2}{2M} \, c \\ \dfrac{\sqrt{\left[ m_2^2 - (M + m_1)^2 \right] \left[ m_2^2 - (M - m_1)^2 \right]}}{2M} \, c \, \mathbf{e}^p \end{pmatrix}_{\Sigma_p}$$

where $\mathbf{e}^p$ is the unit of the spatial part of $p_1^i$ in $\Sigma_p$.


17.5.1 The Triangle Function of a System of Two Particle 
Four-Vectors 


The length $\sqrt{h_{ab}(B) A^a A^b}$ of $\mathbf{A}_{(B)}$ is an invariant, therefore it is a characteristic
quantity of the system of the four-vectors $A^a$, $B^a$. In order to determine the exact
dependence of this quantity on the four-vectors $A^a$, $B^a$ we compute it. From (17.26)
we have:

$$h_{ab}(B) A^a A^b = \left\{ A_a + \frac{1}{2B^2} \left[ -M^2 + A^2 + B^2 \right] B_a \right\} A^a = \frac{1}{4B^2} \lambda^2(M^2, A^2, B^2) \tag{17.29}$$

where:

$$\lambda(M^2, A^2, B^2) = \sqrt{\left[ -M^2 + A^2 + B^2 \right]^2 - 4A^2 B^2} = \sqrt{M^4 + A^4 + B^4 - 2M^2 A^2 - 2M^2 B^2 - 2A^2 B^2}. \tag{17.30}$$

We conclude the following:

1. The function $\lambda(M^2, A^2, B^2)$ is an invariant depending only on the lengths of the
vectors $(A + B)^a$, $A^a$, $B^a$.
2. The function $\lambda(M^2, A^2, B^2)$ is symmetric in all its arguments.

The function $\lambda$ we have met before when we were studying the collision A + B →
C. Here we simply recover it in a more general setup and, furthermore, we give its
covariant geometric meaning. The properties of $\lambda$ are given in Exercise 10.6.1.
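
The relation (17.29) can be verified on a concrete pair of four-vectors. This is a small sketch under assumptions: c = 1, the proper frame of $B^a$ is used as the working frame, and the function name `triangle_lambda` and the sample values are chosen for illustration.

```python
import math

def triangle_lambda(x, y, z):
    """Triangle function of (17.30): square root of the symmetric polynomial in x, y, z."""
    return math.sqrt(x * x + y * y + z * z - 2.0 * (x * y + x * z + y * z))

# Sample system in the proper frame of B (c = 1): B at rest, A with spatial length 0.6
A_len, B_len = 1.0, 2.0
a_spatial = 0.6
A0 = math.sqrt(A_len**2 + a_spatial**2)           # zeroth component of A in Sigma_B

# Length M of the CS four-vector A + B, from (17.24)
M2 = (A0 + B_len)**2 - a_spatial**2

lam = triangle_lambda(M2, A_len**2, B_len**2)
spatial_from_lambda = lam / (2.0 * B_len)         # square root of (17.29)
A0_from_invariants = (M2 - A_len**2 - B_len**2) / (2.0 * B_len)   # zeroth row of (17.28)

print(spatial_from_lambda, a_spatial)
print(A0_from_invariants, A0)
```

Both printed pairs should agree, which shows that the components of $A^a$ in $\Sigma_B$ are fixed by the three lengths alone.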


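Example 17.5.1 Show that the quantity:

$$E = \frac{1}{4} \sqrt{-\lambda^2(x, y, z)} \tag{17.31}$$

equals the area of a Euclidean triangle of sides $\sqrt{x}$, $\sqrt{y}$, $\sqrt{z}$. Prove that the triangle
inequality of Euclidean Geometry assures that in a Euclidean space $\lambda^2(x, y, z) \le 0$.
Solution
The semi-perimeter $\tau$ of a Euclidean triangle of sides $\alpha = \sqrt{x}$, $\beta = \sqrt{y}$, $\gamma = \sqrt{z}$
equals $\tau = (\alpha + \beta + \gamma)/2$ and the area is given by Heron's formula
$E = \sqrt{\tau (\tau - \alpha)(\tau - \beta)(\tau - \gamma)}$. We find:

$$E^2 = \frac{1}{16} (\alpha + \beta + \gamma)(-\alpha + \beta + \gamma)(\alpha - \beta + \gamma)(\alpha + \beta - \gamma)$$

$$= -\frac{1}{16} \left( \alpha^2 - (\beta + \gamma)^2 \right) \left( \alpha^2 - (\beta - \gamma)^2 \right)$$

$$= -\frac{1}{16} \left( x - (\sqrt{y} + \sqrt{z})^2 \right) \left( x - (\sqrt{y} - \sqrt{z})^2 \right) \tag{17.32}$$

$$= -\frac{1}{16} \lambda^2(x, y, z).$$

Because in Euclidean Geometry the area $E \ge 0$ it follows that $\lambda^2(x, y, z) \le 0$.
This appears to create a contradiction, because in Minkowski space $\lambda^2(x, y, z)$ is proportional to the
measure $|\mathbf{A}^*|^2$, which is non-negative, hence in Minkowski space $\lambda^2(x, y, z) \ge 0$! However
there is no problem, because in Euclidean Geometry the triangle inequality implies
that $\lambda^2(x, y, z) \le 0$ and in Minkowski space the same inequality assures that
$\lambda^2(x, y, z) \ge 0$. To prove the latter we consider the four-vectors $A^a$, $B^a$, $(A + B)^a$
and from (17.32) we have for $x = (A + B)^2$, $y = A^2$, $z = B^2$:

$$\left[ (A + B)^2 - \left( \sqrt{A^2} + \sqrt{B^2} \right)^2 \right] \left[ (A + B)^2 - \left( \sqrt{A^2} - \sqrt{B^2} \right)^2 \right] \ge 0 \;\Rightarrow$$

$$(A + B)^2 \ge \left( \sqrt{A^2} + \sqrt{B^2} \right)^2, \qquad (A + B)^2 \ge \left( \sqrt{A^2} - \sqrt{B^2} \right)^2$$

or

$$(A + B)^2 \le \left( \sqrt{A^2} + \sqrt{B^2} \right)^2, \qquad (A + B)^2 \le \left( \sqrt{A^2} - \sqrt{B^2} \right)^2.$$

But $\sqrt{A^2} + \sqrt{B^2} > \sqrt{A^2} - \sqrt{B^2}$ because $A, B > 0$. Hence:

$$(A + B) \ge \sqrt{A^2} + \sqrt{B^2} \tag{17.33}$$

for all pairs of particle four-vectors $A^a$, $B^a$. Therefore the condition:

$$\lambda^2(M^2, A^2, B^2) \ge 0$$

is satisfied and it is equivalent to the triangle inequality in Minkowski space.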

In words, relation (17.33) means that the particle four-vector $(A + B)^a$ is
something "more" than the aggregate of the particle four-vectors $A^a$, $B^a$. This
"something" is the structure which couples the two four-vectors into the system
particle four-vector $(A + B)^a$. To see what this implies we examine its effect in one
well known issue of relativistic Physics, the mass loss. We assume the four-vectors
$A^a$, $B^a$ to be four-momenta. Then $(A + B)^a$ represents the center of momentum particle
whose mass is $M$. Then inequality (17.33) implies that $M$ is larger than the sum of
the masses $m_A$, $m_B$ of the individual particles $A^a$, $B^a$, the difference accounting for
the potential (or internal) energy of the particle $(A + B)^a$.

An interesting special case is $\lambda(x, y, z) = 0$. Then $M = A + B$ and $|\mathbf{A}^*| =
|\mathbf{B}^*| = 0$, which implies that the proper frame of the four-vectors $A^a$, $B^a$ coincides
with the proper frame of $M^a$. The condition $\lambda(x, y, z) = 0$ we call the threshold of
the interaction of the four-vectors $A^a$, $B^a$. Note that the above do not apply only to
four-momenta but to all interacting triples of four-vectors $A^a$, $B^a$, $(A + B)^a$.
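
The inequality (17.33) and the threshold case can be illustrated with two sample four-momenta. The following is a sketch assuming c = 1; the helper names `four_momentum` and `invariant_mass` and the numerical values are hypothetical.

```python
import math

def four_momentum(mass, p3):
    """Future directed four-momentum (E, px, py, pz) of a particle of given mass (c = 1)."""
    energy = math.sqrt(mass**2 + sum(c * c for c in p3))
    return (energy, *p3)

def invariant_mass(pa, pb):
    """Length M of the center of momentum four-vector pa + pb."""
    total = [a + b for a, b in zip(pa, pb)]
    return math.sqrt(total[0]**2 - sum(c * c for c in total[1:]))

m_a, m_b = 1.0, 2.0

# Particles in relative motion: M exceeds the sum of the masses
M_moving = invariant_mass(four_momentum(m_a, (0.6, 0.0, 0.0)),
                          four_momentum(m_b, (0.0, 0.0, 0.0)))

# Threshold: both particles at rest in the same frame, M equals the sum of the masses
M_threshold = invariant_mass(four_momentum(m_a, (0.0, 0.0, 0.0)),
                             four_momentum(m_b, (0.0, 0.0, 0.0)))

print(M_moving, M_threshold, m_a + m_b)
```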


17.5.2 Extreme Values of the Four-Vectors (A + B)* 


The lengths $(A + B)$, $(A - B)$ of a system of two particle four-vectors $A^a$, $B^a$ of
length $A$, $B$ find application in many cases and especially in relativistic reactions
(collisions): $(A + B)$ is the mass of the Center of momentum particle and $(A - B)$
is the amount of transfer of four-momentum. It is of interest to determine the extreme
values of the quantities $(A + B)$, $(A - B)$ when the direction of $A^a$, $B^a$ changes
while their length remains constant.

From (17.23) we have $-(A + B)^2 = -A^2 - B^2 + 2A^a B_a$, therefore the extremum
of $-(A + B)^2$ occurs when the term $A^a B_a$ is an extremum. The term $A^a B_a$ is
invariant, therefore it can be computed in any LCF. We choose the proper
frame $\Sigma_A$ of $A^a$ and write:

$$A^a = \begin{pmatrix} A \\ \mathbf{0} \end{pmatrix}_{\Sigma_A}, \qquad B^a = \begin{pmatrix} B^0_{(A)} \\ \mathbf{B}_{(A)} \end{pmatrix}_{\Sigma_A}$$

from which follows:

$$A^a B_a = -A B^0_{(A)}.$$

But $B^0_{(A)} = \sqrt{B^2 + \mathbf{B}^2_{(A)}}$, hence:

$$A^a B_a = -A \sqrt{B^2 + \mathbf{B}^2_{(A)}}. \tag{17.34}$$

We note that in the rhs the only quantity which changes is $\mathbf{B}_{(A)}$, therefore the
extremum (maximum) of $A^a B_a$ occurs if:

$$\frac{\partial (A^a B_a)}{\partial \mathbf{B}_{(A)}} = 0.$$

This condition gives:

$$-A \frac{\mathbf{B}_{(A)}}{\sqrt{B^2 + \mathbf{B}^2_{(A)}}} = 0 \;\Rightarrow\; \mathbf{B}_{(A)} = 0$$

hence:

$$(A^a B_a)_{\max} = -AB.$$

It follows:

$$(A + B)^2_{\min} = A^2 + B^2 + 2AB = (A + B)^2$$

$$(A - B)^2_{\max} = A^2 + B^2 - 2AB = (A - B)^2.$$

Condition $\mathbf{B}_{(A)} = 0$ means that the proper frames of the four-vectors $A^a$, $B^a$
coincide or $A^a = a B^a$, where $a$ is an invariant. To compute $a$ we multiply this
equation with $A_a$ and find:

$$-A^2 = a (A^a B_a) = -a A B \;\Rightarrow\; a = \frac{A}{B}.$$

Therefore the condition for the extremum is³: $\frac{A^a}{A} = \frac{B^a}{B}$ or $A^a \,//\, B^a$.


³ It is possible to compute the extremals of the quantity (17.34) without any calculations if we note
that the quantity under the square root is non-negative. Therefore the maximum value occurs for
the minimum value of the square root, which is at $\mathbf{B}_{(A)} = 0$, and the minimum when the square root
is maximum, that is, when $|\mathbf{B}_{(A)}|$ is infinite.
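
The extremum can also be seen by scanning the length of the spatial part $\mathbf{B}_{(A)}$. This is a minimal sketch assuming c = 1 and using (17.34) directly; the function name `center_mass` and the scanned values are illustrative.

```python
import math

def center_mass(A, B, b):
    """Length M of (A + B)^a when B has spatial part of magnitude b in the proper frame of A."""
    inner = -A * math.sqrt(B * B + b * b)            # A^a B_a, relation (17.34)
    return math.sqrt(A * A + B * B - 2.0 * inner)    # from -(A+B)^2 = -A^2 - B^2 + 2 A^a B_a

A, B = 1.0, 2.0
scan = [center_mass(A, B, b) for b in (0.0, 0.5, 1.0, 2.0, 4.0)]
print(scan)
```

The first entry, at $\mathbf{B}_{(A)} = 0$, equals $A + B$, and the values grow as the spatial part increases.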


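17.5.3 The System $A^a$, $B^a$, $(A + B)^a$ of Particle Four-Vectors
in CS


Let $A^a$, $B^a$ be two particle four-vectors which are not null and parallel and let $\Sigma^*$ be
the CS of the system of $A^a$, $B^a$. The components of the four-vectors in CS we shall
denote with an asterisk, e.g. for the vector $A^a$ we write:

$$A^a = \begin{pmatrix} A^{0*} \\ \mathbf{A}^* \end{pmatrix}_{\Sigma^*}. \tag{17.35}$$

In order to compute $A^{0*}$ we note that in $\Sigma^*$ we have $(A + B)^a = \begin{pmatrix} M \\ \mathbf{0} \end{pmatrix}_{\Sigma^*}$, hence:

$$A^a (A + B)_a = -A^{0*} M. \tag{17.36}$$

But from (17.25) we have if we replace B with M (why can we do this?):

$$A^a (A + B)_a = \frac{1}{2} \left[ -M^2 - A^2 + B^2 \right]$$

from which follows:

$$-A^{0*} M = \frac{1}{2} \left[ -M^2 - A^2 + B^2 \right] \;\Rightarrow\; A^{0*} = \frac{1}{2M} \left[ M^2 + A^2 - B^2 \right]. \tag{17.37}$$

Concerning the length of the spatial part $\mathbf{A}^*$, from (17.29) we have:

$$(\mathbf{A}^*)^2 = h_{ab}(A + B) A^a A^b = \frac{1}{4M^2} \lambda^2(M^2, A^2, B^2). \tag{17.38}$$

It is instructive to compute the components of the four-vector $A^a$ in the CS
directly by making use of the decomposition (17.28). To do this we write $(A + B)^a =
A^a + B^a$ as $-B^a = A^a - (A + B)^a$, which shows that $-B^a$ is the CS of the
four-vectors $A^a$ and $-(A + B)^a$. Therefore relation (17.28) applies if we make the
correspondence:

$$M \leftrightarrow B, \qquad \Sigma_B \leftrightarrow \Sigma^*, \qquad \hat{\mathbf{A}}_{(B)} \leftrightarrow -\hat{\mathbf{A}}^*, \qquad A^0_{(B)} \leftrightarrow -A^{0*}.$$

It follows:

$$A^a = \begin{pmatrix} \dfrac{M^2 + A^2 - B^2}{2M} \\ \dfrac{1}{2M} \lambda(M^2, A^2, B^2) \, \hat{\mathbf{A}}^* \end{pmatrix}_{\Sigma^*}. \tag{17.39}$$

Concerning the decomposition of $B^a$ we have from (17.39), if we interchange
$A \leftrightarrow B$ and note that $\mathbf{A}^* + \mathbf{B}^* = 0$:

$$B^a = \begin{pmatrix} \dfrac{M^2 - A^2 + B^2}{2M} \\ -\dfrac{1}{2M} \lambda(M^2, A^2, B^2) \, \hat{\mathbf{A}}^* \end{pmatrix}_{\Sigma^*}. \tag{17.40}$$

In order to check our results we compute the angle $\theta^*_{AB}$ between the vectors
$\mathbf{A}^*$, $\mathbf{B}^*$ in $\Sigma^*$. Obviously we expect to find $\theta^*_{AB} = \pi$ since $\mathbf{A}^* = -\mathbf{B}^*$. We compute:

$$\mathbf{A}^* \cdot \mathbf{B}^* = h_{ab}(A + B) A^a B^b = -\frac{1}{4M^2} \lambda^2(M^2, A^2, B^2) = |\mathbf{A}^*| |\mathbf{B}^*| \cos \theta^*_{AB}.$$

But $|\mathbf{A}^*| = |\mathbf{B}^*| = \frac{1}{2M} \lambda(M^2, A^2, B^2)$, therefore $\cos \theta^*_{AB} = -1 \Rightarrow \theta^*_{AB} = \pi$.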
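
The CS components (17.39) and (17.40) can be compared with an explicit Lorentz boost. The sketch below assumes c = 1 and motion along the x-axis; the helper `boost_x`, the other function names and the sample values are assumptions introduced for the check.

```python
import numpy as np

def triangle_lambda(x, y, z):
    """Triangle function of (17.30)."""
    return np.sqrt(x * x + y * y + z * z - 2.0 * (x * y + x * z + y * z))

def cs_components(M, A, B):
    """Zeroth components of A, B and the common spatial length in the CS, from (17.39), (17.40)."""
    lam = triangle_lambda(M**2, A**2, B**2)
    return ((M**2 + A**2 - B**2) / (2.0 * M),
            (M**2 - A**2 + B**2) / (2.0 * M),
            lam / (2.0 * M))

def boost_x(p, beta):
    """Components of p in a frame moving with velocity beta along x (c = 1)."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    return np.array([gamma * (p[0] - beta * p[1]), gamma * (p[1] - beta * p[0]), p[2], p[3]])

len_a, len_b = 1.0, 2.0
p_a = np.array([np.sqrt(len_a**2 + 0.6**2), 0.6, 0.0, 0.0])   # A moving along x
p_b = np.array([len_b, 0.0, 0.0, 0.0])                        # B at rest

total = p_a + p_b
M = np.sqrt(total[0]**2 - total[1:] @ total[1:])
beta_cs = total[1] / total[0]                                 # velocity of the CS along x

a_cs, b_cs = boost_x(p_a, beta_cs), boost_x(p_b, beta_cs)
e_a, e_b, p_star = cs_components(M, len_a, len_b)

print(a_cs[0], e_a)
print(b_cs[0], e_b)
print(abs(a_cs[1]), p_star)
```

The boosted components should reproduce the values computed from the invariants alone, and the spatial parts in the CS should be opposite.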


17.5.4 The System $A^a$, $B^a$, $(A + B)^a$ in the Lab


The LCF in which we study the "motion" of systems of particle four-vectors are:

(a) The CS

(b) The proper frame of one of the particle four-vectors, which we call the target
system

(c) The Laboratory System (lab), which usually coincides with the proper frame of
one of the particle four-vectors

In Sect. 17.5.3 we studied the system of two particle four-vectors $A^a$, $B^a$ in the
CS. In this section we study the same system in the lab, which we assume
coincides with the proper frame of particle $B^a$.

We denote the components of a four-vector in the lab with an L and write:

$$A^a = \begin{pmatrix} A^{0L} \\ \mathbf{A}^L \end{pmatrix}_L, \qquad B^a = \begin{pmatrix} B \\ \mathbf{0} \end{pmatrix}_L, \qquad (A + B)^a = \begin{pmatrix} (A + B)^{0L} \\ (\mathbf{A} + \mathbf{B})^L \end{pmatrix}_L. \tag{17.41}$$

In order to compute the components $A^{0L}$, $(\mathbf{A}^L)^2$ we apply relations (17.37)
and (17.38) provided we change A + B with B. From (17.37) we find:

$$A^{0L} = \frac{1}{2B} \left[ -B^2 - A^2 + M^2 \right] = -\frac{1}{2B} \left[ B^2 + A^2 - M^2 \right] \tag{17.42}$$

and from (17.38):

$$(\mathbf{A}^L)^2 = h_{ab}(B) A^a A^b = \frac{1}{4B^2} \lambda^2(M^2, A^2, B^2) \tag{17.43}$$

where we have used the fact that the triangle function is symmetric in all its arguments.
Let us assume that $\Sigma^*$ is moving wrt $\Sigma_L = \Sigma_B$ with velocity $\boldsymbol{\beta}^*$. Then we have:

$$B^a = \begin{pmatrix} B \\ \mathbf{0} \end{pmatrix}_{\Sigma_L} = \begin{pmatrix} \dfrac{M^2 - A^2 + B^2}{2M} \\ \dfrac{1}{2M} \lambda(M^2, A^2, B^2) \, \hat{\mathbf{B}}^* \end{pmatrix}_{\Sigma^*}.$$

The two expressions are related by the Lorentz transformation which relates $\Sigma^*$,
$\Sigma_L$. Therefore (see (1.74)):

$$\frac{M^2 - A^2 + B^2}{2M} = \gamma^* B \;\Rightarrow\; \gamma^* = \frac{M^2 - A^2 + B^2}{2BM} \tag{17.44}$$

$$\frac{1}{2M} \lambda(M^2, A^2, B^2) \, \hat{\mathbf{B}}^* = -\gamma^* B \boldsymbol{\beta}^* \;\Rightarrow\; \boldsymbol{\beta}^* = -\frac{\lambda(M^2, A^2, B^2)}{2M \gamma^* B} \, \hat{\mathbf{B}}^*. \tag{17.45}$$

Let us consider the application of the above general results in special two particle
systems.


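Example 17.5.2 Consider an electron and a positron with four-momenta $p^a_{e^-}$, $p^a_{e^+}$
respectively. Assume c = 1.

1. Determine the energy of each particle in the CS.

2. Show that in the CS the spatial momenta are antiparallel.

3. Assume that the positron rests in the lab and compute the velocity of the CS in
the lab.

Solution

Let $m_{e^-} = m_{e^+} = m$ be the masses of the particles involved. The four-momentum
of the Center of momentum particle is $(p_{e^-} + p_{e^+})^a = p^a_{e^-} + p^a_{e^+}$ and let its mass
be $M$. Identifying $A^a$, $B^a$ with $p^a_{e^-}$, $p^a_{e^+}$ respectively we find from (17.39) and (17.40):

$$p^a_{e^-} = \begin{pmatrix} \dfrac{M^2 + m^2 - m^2}{2M} \\ \dfrac{1}{2M} \lambda(M^2, m^2, m^2) \, \hat{\mathbf{p}}^*_{e^-} \end{pmatrix}_{\Sigma^*} = \begin{pmatrix} \dfrac{M}{2} \\ \dfrac{1}{2} \sqrt{M^2 - 4m^2} \, \hat{\mathbf{p}}^*_{e^-} \end{pmatrix}_{\Sigma^*} \tag{17.46}$$

$$p^a_{e^+} = \begin{pmatrix} \dfrac{M^2 - m^2 + m^2}{2M} \\ -\dfrac{1}{2M} \lambda(M^2, m^2, m^2) \, \hat{\mathbf{p}}^*_{e^-} \end{pmatrix}_{\Sigma^*} = \begin{pmatrix} \dfrac{M}{2} \\ -\dfrac{1}{2} \sqrt{M^2 - 4m^2} \, \hat{\mathbf{p}}^*_{e^-} \end{pmatrix}_{\Sigma^*}. \tag{17.47}$$

The energy is the zeroth component of the four-momentum, hence:

$$E^*_{e^-} = E^*_{e^+} = \frac{M}{2}.$$

The 3-momenta are the spatial part of the four-momentum. It follows:

$$|\mathbf{p}^*_{e^-}| = |\mathbf{p}^*_{e^+}| = \frac{1}{2} \sqrt{M^2 - 4m^2}.$$

Concerning the angle $\theta^*$ between the two 3-momenta in the CS we find $\theta^* = \pi$.
From (17.44) we find that the $\gamma$-factor of the CS in the lab (= proper frame of
$p^a_{e^+}$) is:

$$\gamma^* = \frac{M^2 - m^2 + m^2}{2mM} = \frac{M}{2m}.$$

Finally from (17.45) we find for the $\beta^*$-factor (c = 1):

$$\boldsymbol{\beta}^* = \frac{\lambda(M^2, m^2, m^2)}{2M \gamma^* m} \, \hat{\mathbf{p}}_{e^-} = \frac{\sqrt{M^2 - 4m^2}}{M} \, \hat{\mathbf{p}}_{e^-}.$$

The 3-vector $\hat{\mathbf{p}}_{e^-}$ is a space direction in the lab which shall be defined from the
initial conditions (it is the direction of the bullet particle in the lab).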


In the next exercise we compute the same results using direct calculation, so that 
the reader will gain experience with this type of problems. 


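Exercise 17.5.2 Assume $p^a_{e^-} = A^a$, $p^a_{e^+} = B^a$, $A^2 = B^2 = m^2$ and that the
length of the momentum of the Center of Momenta particle is $M$. Verify the following
calculations:

a. Energies:

$$E^*_{e^-} = E^*_{e^+} = \frac{1}{2} M c^2.$$

b. 3-momenta:

$$h_{ab}(p_{e^-} + p_{e^+}) \, p^a_{e^-} p^b_{e^-} = -m^2 c^2 + \frac{1}{4M^2} \left( M^2 - m^2 + m^2 \right)^2 c^2 = \left( \frac{M^2}{4} - m^2 \right) c^2.$$

c. Angle:

$$h_{ab}(p_{e^-} + p_{e^+}) \, p^a_{e^-} p^b_{e^+} = -\frac{1}{4M^2} \left[ M^4 - 4m^2 M^2 \right] c^2 = -\frac{1}{4} \left[ M^2 - 4m^2 \right] c^2 = -|\mathbf{p}^*_{e^-}| |\mathbf{p}^*_{e^+}| = |\mathbf{p}^*_{e^-}| |\mathbf{p}^*_{e^+}| \cos \pi.$$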


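Example 17.5.3 In the LCF $\Sigma$ the null four-vectors $A^a$, $B^a$ have components
$A^a = A(1, 1, 0, 0)_\Sigma$, $B^a = B(1, 0, 1, 0)_\Sigma$.

1. If $A^a$, $B^a$ are four-momenta of photons determine the energy and the direction
of motion (the speed is known!) of the photons in the LCF $\Sigma$.

2. A particle $\Gamma$ of mass $\sqrt{3}$ moves in the plane x, y of $\Sigma$ with factor $\beta = \frac{1}{2}$ in a
direction which makes an angle 45° with the x-axis. Determine the energy and
the four-momentum of particle $\Gamma$ in $\Sigma$.

3. Compute the angle between the directions of motion of the photons $A^a$, $B^a$ in the
proper frame of the particle $\Gamma$.

Solution

1. The energy of the photon $A^a$ in $\Sigma$ is:

$$E_A / c = A \;\Rightarrow\; E_A = Ac$$

and the 3-momentum:

$$\mathbf{p}_A = A \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}.$$

The direction of motion of the photon $A^a$ in $\Sigma$ is:

$$\hat{\mathbf{e}}_A = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}.$$

Prove that the photon $B^a$ moves in $\Sigma$ in a direction normal to the direction of
$A^a$.

2. The $\gamma$-factor of the particle $\Gamma$ in $\Sigma$ is:

$$\gamma = \frac{1}{\sqrt{1 - \beta^2}} = \frac{2}{\sqrt{3}}.$$

Therefore the energy and the 3-momentum of $\Gamma$ in $\Sigma$ are:

$$E_\Gamma = m \gamma c^2 = 2c^2$$

$$\mathbf{p}_\Gamma = m \gamma \beta c \, \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} = \frac{c}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}.$$

The four-momentum of $\Gamma$ in $\Sigma$ is:

$$p^a_\Gamma = \begin{pmatrix} E_\Gamma / c \\ \mathbf{p}_\Gamma \end{pmatrix}_\Sigma = \begin{pmatrix} 2c \\ c/\sqrt{2} \\ c/\sqrt{2} \\ 0 \end{pmatrix}_\Sigma.$$

We check the results by showing that $p^a_\Gamma p_{\Gamma a} = -m_\Gamma^2 c^2$. Indeed:

$$p^a_\Gamma p_{\Gamma a} = -4c^2 + \frac{c^2}{2} + \frac{c^2}{2} = -3c^2.$$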


3. To find the angle of the space directions of motion of the photons A“, B® in the 
proper frame of I’ we consider the inner product: 


1 
hap(T) A‘ B? _ (ns a saPtaPrs) A“B? 


1 
= A°B, + — (praA*) (prsB’) 


3c? 
1 1 
pig! Bio 2 
Taeae | TT 
“3 Te 


But: 


hap(T)A“B? = |$pliepl cos O48, 


The |4p| = |2p| = 1 therefore 048,5 = cos '{3(-2 + yp! 


17.6 The Relativistic System $A^a + B^a \to C^a + D^a$


Let $M$ be the length of the common Center System four-vector of the pairs $(A^a, B^a)$,
$(C^a, D^a)$. From (17.25) we have:

$$-M^2 = -A^2 - B^2 + 2(AB) = -C^2 - D^2 + 2(CD) \tag{17.48}$$


where $(AB) = A^a B_a$, $(CD) = C^a D_a$. We assume that the lengths $A$, $B$, $C$, $D$ of
the particle four-vectors are given (e.g. they are the masses of the corresponding
particles) and also that we are given enough data to compute one of the inner
products $(AB)$ or $(CD)$. We shall show that with these data we can compute the
remaining quantities involved.

In order to do that we consider the decomposition of the four-vectors: 


(a) In the lab, which we assume coincides with the proper frame of the (non-null!)
particle four-vector $B^a$ and
(b) In the CS.

Before we continue our discussion we recall that if we are given the four-vectors
of one pair, e.g. the pair $(A^a, B^a)$, in a LCF $\Sigma$ then the $\beta^*$ factor and the $\gamma^*$ factor of
the CS in $\Sigma$ are given by the relations:

$$\boldsymbol{\beta}^*_\Sigma = \frac{\mathbf{A}_\Sigma + \mathbf{B}_\Sigma}{A^0_\Sigma + B^0_\Sigma}, \qquad \gamma^*_\Sigma = \frac{A^0_\Sigma + B^0_\Sigma}{M}. \tag{17.49}$$

In case $\Sigma$ is the proper frame of $B^a$ (= lab) then these relations read:

$$\boldsymbol{\beta}^*_L = \frac{\mathbf{A}^L}{A^{0L} + B}, \qquad \gamma^*_L = \frac{A^{0L} + B}{M}. \tag{17.50}$$


In the calculations it will be useful to write relations (17.49) and (17.50) in
covariant form, that is in terms of tensor quantities. If $\Sigma$ is determined by the unit
timelike four-vector $s^a$ ($s^a s_a = -1$) and $A^a + B^a$ is the center
four-vector, then (17.49) is written:

$$\beta^*_\Sigma = \frac{\sqrt{h_{ab}(s) (A^a + B^a)(A^b + B^b)}}{-s_a (A^a + B^a)}, \qquad \gamma^*_\Sigma = -\frac{s_b (A^b + B^b)}{M}. \tag{17.51}$$

In the lab these relations give:

$$\boldsymbol{\beta}^*_L = \frac{\sqrt{h_{ab}(B) (A^a + B^a)(A^b + B^b)}}{-\frac{1}{B} B_a (A^a + B^a)} \, \hat{\mathbf{A}} = \frac{\frac{1}{2B} \lambda(M^2, A^2, B^2)}{-\frac{1}{B} \left( A^b B_b - B^2 \right)} \, \hat{\mathbf{A}} = \frac{\lambda(M^2, A^2, B^2)}{M^2 - A^2 + B^2} \, \hat{\mathbf{A}} \tag{17.52}$$

$$\gamma^*_L = -\frac{B_b (A^b + B^b)}{BM} = \frac{-B_b A^b + B^2}{BM} = \frac{M^2 - A^2 + B^2}{2BM} \tag{17.53}$$

where $\hat{\mathbf{A}}$ is the unit of the spatial direction of $A^a$ in the proper frame of $B^a$.
From relations (17.44) and (17.45) we have the following decompositions of the
four-vectors in the lab ($\Sigma_L$) and in the CS ($\Sigma^*$).⁴


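$$A^a = \begin{pmatrix} \dfrac{M^2 + A^2 - B^2}{2M} \\ \dfrac{1}{2M} \lambda(M^2, A^2, B^2) \, \hat{\mathbf{A}}^* \end{pmatrix}_{\Sigma^*} = \begin{pmatrix} \dfrac{M^2 - A^2 - B^2}{2B} \\ \dfrac{1}{2B} \lambda(M^2, A^2, B^2) \, \hat{\mathbf{A}}^L \end{pmatrix}_{\Sigma_L} \tag{17.54}$$

$$B^a = \begin{pmatrix} \dfrac{M^2 - A^2 + B^2}{2M} \\ -\dfrac{1}{2M} \lambda(M^2, A^2, B^2) \, \hat{\mathbf{A}}^* \end{pmatrix}_{\Sigma^*} = \begin{pmatrix} B \\ \mathbf{0} \end{pmatrix}_{\Sigma_L} \tag{17.55}$$

$$C^a = \begin{pmatrix} \dfrac{M^2 + C^2 - D^2}{2M} \\ \dfrac{1}{2M} \lambda(M^2, C^2, D^2) \, \hat{\mathbf{C}}^* \end{pmatrix}_{\Sigma^*} = \begin{pmatrix} -\dfrac{(CB)}{B} \\ h^a_b(B) C^b \end{pmatrix}_{\Sigma_L} \tag{17.56}$$

$$D^a = \begin{pmatrix} \dfrac{M^2 - C^2 + D^2}{2M} \\ -\dfrac{1}{2M} \lambda(M^2, C^2, D^2) \, \hat{\mathbf{C}}^* \end{pmatrix}_{\Sigma^*} = \begin{pmatrix} -\dfrac{(DB)}{B} \\ h^a_b(B) D^b \end{pmatrix}_{\Sigma_L}. \tag{17.57}$$

In order to calculate the zeroth component (i.e. $(CB)$) of the four-vector $C^a$ in
the lab, we note that the inner product is invariant, therefore we can compute it in
any frame we wish. We choose the CS $\Sigma^*$ where we know the components of the
four-vectors. We have:

$$(CB) = -\frac{1}{4M^2} \left[ \left( M^2 - A^2 + B^2 \right) \left( M^2 + C^2 - D^2 \right) + \lambda(M^2, A^2, B^2) \lambda(M^2, C^2, D^2) \left( \hat{\mathbf{A}}^* \cdot \hat{\mathbf{C}}^* \right) \right]. \tag{17.58}$$

Similarly for the four-vector $D^a$ we have:

$$(DB) = -\frac{1}{4M^2} \left[ \left( M^2 - A^2 + B^2 \right) \left( M^2 - C^2 + D^2 \right) - \lambda(M^2, A^2, B^2) \lambda(M^2, C^2, D^2) \left( \hat{\mathbf{A}}^* \cdot \hat{\mathbf{C}}^* \right) \right]. \tag{17.59}$$

⁴ The "=" does not mean that we can equate the corresponding components of the four-vectors,
because the decompositions/components refer to different coordinate frames. It simply indicates
that they refer to the decomposition of the same four-vector in different LCF. The components of
each vector are related to the other via the Lorentz transformation which relates the corresponding
LCF.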


We conclude that in order to determine the zeroth components of the "daughter"
four-vectors $C^a$, $D^a$ in the lab we need to know the angle between $\mathbf{A}^*$, $\mathbf{C}^*$ in the CS
$\Sigma^*$.

In the following we compute the various angles which enter in the geometry of 
the interaction. 


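a. Computation of the angle $\theta^L_{CD}$ between the spatial parts of the four-vectors
$C^a$, $D^a$ in the lab frame.
We have:

$$h_{ab}(B) C^a D^b = \left( \eta_{ab} + \frac{1}{B^2} B_a B_b \right) C^a D^b = (CD) + \frac{1}{B^2} (CB)(DB) = \frac{1}{2} \left( -M^2 + C^2 + D^2 \right) + \frac{1}{B^2} (CB)(DB)$$

$$= |\mathbf{C}^L| |\mathbf{D}^L| \cos \theta^L_{CD} \;\Rightarrow$$

$$\cos \theta^L_{CD} = \frac{1}{|\mathbf{C}^L| |\mathbf{D}^L|} \left[ \frac{1}{2} \left( -M^2 + C^2 + D^2 \right) + \frac{1}{B^2} (CB)(DB) \right] \tag{17.60}$$

where:

$$|\mathbf{C}^L| = \sqrt{h_{ab}(B) C^a C^b} = \sqrt{-C^2 + \frac{1}{B^2} (CB)^2} \tag{17.61}$$

$$|\mathbf{D}^L| = \sqrt{h_{ab}(B) D^a D^b} = \sqrt{-D^2 + \frac{1}{B^2} (DB)^2}. \tag{17.62}$$

b. Computation of the angle $\theta^L_{AC}$ between the spatial parts of the four-vectors $C^a$ and
$A^a$ in the lab.
The spatial parts of the four-vectors $A^a$, $C^a$ in the lab are given by the relations
$h_{ab}(B) A^b$, $h^a_d(B) C^d$ respectively. Therefore the angle $\theta^L_{AC}$ of the spatial
parts in the lab is given by:

$$|\mathbf{C}^L| |\mathbf{A}^L| \cos \theta^L_{AC} = h_{ab}(B) A^b h^a_d(B) C^d = h_{ab}(B) A^a C^b. \tag{17.63}$$

The term:

$$h_{ab}(B) C^a A^b = (CA) + \frac{1}{B^2} (AB)(CB) = (CA) + \frac{-M^2 + A^2 + B^2}{2B^2} (CB).$$

In order to compute the inner product $(CA)$ we use the conservation equation
$A^a + B^a = C^a + D^a$, which we multiply with $C_a$ and get:

$$(AC) + (CB) = -C^2 + (CD) \;\Rightarrow$$

$$(AC) = -(CB) - C^2 + \frac{1}{2} \left( -M^2 + C^2 + D^2 \right) = -(CB) - \frac{1}{2} \left( M^2 + C^2 - D^2 \right). \tag{17.64}$$

Replacing we find:

$$h_{ab}(B) C^a A^b = -\frac{1}{2} \left[ \left( M^2 + C^2 - D^2 \right) + \frac{1}{B^2} \left( M^2 - A^2 + B^2 \right) (CB) \right]. \tag{17.65}$$

The term $|\mathbf{C}^L|$ has been calculated in (17.61) and the term $|\mathbf{A}^L|$ in (17.43).
Introducing the above results in (17.63) we find:

$$\cos \theta^L_{AC} = -\frac{B^2}{\lambda(M^2, A^2, B^2) \sqrt{-B^2 C^2 + (CB)^2}} \left[ \left( M^2 + C^2 - D^2 \right) + \frac{1}{B^2} \left( M^2 - A^2 + B^2 \right) (CB) \right]. \tag{17.66}$$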


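c. Computation of the angle $\theta^*_{AC}$ of the spatial part of the four-vector $C^a$ with the
spatial part of the four-vector $A^a$ in the CS.
We note that normal to the direction of $\boldsymbol{\beta}^*_L$ the 3-vector $\mathbf{C}^L$ does not change. The
condition for this is:

$$\mathbf{C}^L \times \boldsymbol{\beta}^*_L = \mathbf{C}^* \times \boldsymbol{\beta}^*_L \tag{17.67}$$

which gives:

$$|\mathbf{C}^L| \sin \theta^L_{AC} = |\mathbf{C}^*| \sin \theta^*_{AC} \;\Rightarrow\; \sin \theta^L_{AC} = \frac{\lambda(M^2, C^2, D^2)}{2M \sqrt{-C^2 + \frac{1}{B^2} (CB)^2}} \, \sin \theta^*_{AC} \tag{17.68}$$

where we have made use of (17.61).

With the calculation of the angle $\theta^L_{AC}$ we have completed the computation of the various
quantities concerning the interaction A + B → C + D.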

Obviously it is necessary that we organize the above results in order to make clear 
their internal coherence and, most important, to make them usable in practice. 


A. Data
We take as data the quantities $A$, $B$, $C$, $D$, $A^{0L}$, $\hat{\mathbf{A}}$, which practically means
the masses of the particles A, B, C, D, the energy $E^L_A$ of A in the lab (= proper
frame of B) and the direction of motion of particle A in the lab.
B. Computed quantities

1. The mass $M$ of the Center of momentum particle:

$$M^2 = A^2 + B^2 + 2B E^L_A. \tag{17.69}$$

2. The $\lambda$ functions of the mother and the daughter particles:

$$\lambda^2(M^2, A^2, B^2) = M^4 + A^4 + B^4 - 2M^2 A^2 - 2M^2 B^2 - 2A^2 B^2 \tag{17.70}$$

$$\lambda^2(M^2, C^2, D^2) = M^4 + C^4 + D^4 - 2M^2 C^2 - 2M^2 D^2 - 2C^2 D^2. \tag{17.71}$$

3. The factors $\boldsymbol{\beta}^*_L$, $\gamma^*_L$ of the CS in the lab system:

$$\boldsymbol{\beta}^*_L = \frac{\lambda(M^2, A^2, B^2)}{M^2 - A^2 + B^2} \, \hat{\mathbf{A}} \tag{17.72}$$

$$\gamma^*_L = \frac{M^2 - A^2 + B^2}{2BM} \tag{17.73}$$

$$\gamma^*_L \boldsymbol{\beta}^*_L = \frac{\lambda(M^2, A^2, B^2)}{2MB} \, \hat{\mathbf{A}}. \tag{17.74}$$

The knowledge of these quantities fixes the Lorentz transformation between &* 
pe 


ee] 

rear*+ - (B% -r*) — it Bx (17.75) 
L 

It = yh(* — Bi -r*). (17.76) 


4. The energies of the mother and the daughter particles in the CS &*: 


Ao%* — BX = M? + A — B? (17.77) 
“ 2M : 
M? — A? + B? 
B& = E% = ——_____ 17.78 
B OM ( ) 
M2 4.0*— Dp? 
c*% =F = x ls Sea (17.79) 


2M 


636 17 Geometric Description of Relativistic Interactions 


M2 — C2 + D? 


= 2M 


(17.80) 


5. The lengths of the 3-momenta of the mother and the daughter particles in the 
CS X*: 


1 

A*| = |B*| = ——a(M2, A”, B? 17.81 

|A*| = |B*| aT, ( i ( ) 
1 

C*| = |D*| = —a(mM2, C2, D?). 17.82 

|C*| = |D*| Sa ( ) ( ) 


It is not possible to compute any additional quantities because none of the 
four-vectors C“, D@ is completely known either in X* or in D“. It is required 
an additional datum and as such we consider the angle between the spatial 
directions A’, C* in the CS, that is we assume we know the (Euclidean) inner 
product A* . C*. With this new datum we compute the following quantities: 

6. The invariants (CB), (DB), (CA), (DA): 


(CB) = a [(w2 — a? +B) (m2 +c? - D°)] 
—_ ap [acw, A’, B*)A(M?, Ce D?)(A* . | (17.83) 
(DB) = —(CB) — : [we +B A] (17.84) 
I 2 2 2: 
(CA) = —(CB) — 5(M +C*— D*) (17.85) 


(DA) = —(CB) + ; [we + B? A] af sur C?+D*). (17.86) 


7. The energies of the daughter particles in the lab =: 


1 

EE = — (CB) [(CB) < 0 because Eé > 0] (17.87) 
1 

EL = —3(PB) — [(DB) < 0 because Ey > Ol. (17.88) 


8. The length of the 3-momentum of the daughter particles in the lab UF: 


1 
IC/| = j-¢? + cay (17.89) 


1 
|\D“| = |-»? + Psy. (17.90) 


17.6 The Relativistic System A“ + BY + C% + D* 637 


9. The angle $\theta^L_{CD}$ between the directions of motion of the daughter particles in the
lab $\Sigma^L$:

$$\cos\theta^L_{CD} = \frac{1}{|\mathbf{C}^L||\mathbf{D}^L|}\left[\frac{1}{2}(-M^2 + C^2 + D^2) + \frac{1}{B^2}(CB)(DB)\right]. \tag{17.91}$$


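10. The angle $\theta^L_{AC}$ between the 3-momentum $\mathbf{C}^L$ of the daughter particle C and the
direction of motion of the mother particle A in the lab $\Sigma^L$:

$$\cos\theta^L_{AC} = -\frac{1}{B\,|\mathbf{C}^L|\,\lambda(M^2, A^2, B^2)}\left[B^2(M^2 + C^2 - D^2) + (M^2 - A^2 + B^2)(CB)\right] \tag{17.92}$$

$$\sin\theta^L_{AC} = \frac{|\mathbf{C}^*|}{|\mathbf{C}^L|}\sin\theta^*_{AC} = \frac{\lambda(M^2, C^2, D^2)}{2M|\mathbf{C}^L|}\sin\theta^*_{AC}. \tag{17.93}$$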


The above general relations can be used directly in an algebraic computing
programme to develop software which automatically solves collision problems,
provided the correct data have been supplied. We give a hint of how this can be
done in the examples below.
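As a hedged illustration of such a programme, the following Python sketch evaluates the relations (17.71) to (17.93) numerically. It is not the author's software: it assumes units with c = 1, that particle B rests in the lab, that A moves in the lab, and that M follows from M² = A² + B² + 2B·E_A^L (the form of (17.69) used in the examples). The function and argument names are illustrative only.

```python
import math

def triangle(x, y, z):
    """Triangle function lambda(x, y, z) = sqrt(x^2+y^2+z^2-2xy-2xz-2yz)."""
    val = x * x + y * y + z * z - 2 * x * y - 2 * x * z - 2 * y * z
    return math.sqrt(max(val, 0.0))  # clip tiny negative rounding errors

def two_body_kinematics(A, B, C, D, E_A_lab, cos_theta_star):
    """Kinematics of A + B -> C + D with B at rest in the lab (c = 1).

    A, B, C, D      : lengths (masses) of the four-momenta
    E_A_lab         : lab energy of A (assumed larger than A, so A moves)
    cos_theta_star  : cosine of the angle between A and C in the CS
    """
    M2 = A * A + B * B + 2 * B * E_A_lab            # assumed form of (17.69)
    M = math.sqrt(M2)
    lam_AB = triangle(M2, A * A, B * B)
    lam_CD = triangle(M2, C * C, D * D)
    beta_cs = lam_AB / (M2 - A * A + B * B)         # (17.72), magnitude
    gamma_cs = (M2 - A * A + B * B) / (2 * B * M)   # (17.73)
    # Invariants (17.83), (17.84)
    CB = -((M2 - A * A + B * B) * (M2 + C * C - D * D)
           + lam_AB * lam_CD * cos_theta_star) / (4 * M2)
    DB = -CB - 0.5 * (M2 + B * B - A * A)
    # Lab energies (17.87), (17.88) and 3-momenta (17.89), (17.90)
    E_C, E_D = -CB / B, -DB / B
    p_C = math.sqrt(max(E_C ** 2 - C * C, 0.0))
    p_D = math.sqrt(max(E_D ** 2 - D * D, 0.0))
    # Lab angles (17.91), (17.92)
    cos_CD = (0.5 * (-M2 + C * C + D * D) + CB * DB / (B * B)) / (p_C * p_D)
    cos_AC = -(B * B * (M2 + C * C - D * D)
               + (M2 - A * A + B * B) * CB) / (B * p_C * lam_AB)
    return {"M": M, "beta_cs": beta_cs, "gamma_cs": gamma_cs,
            "CB": CB, "DB": DB, "E_C": E_C, "E_D": E_D,
            "p_C": p_C, "p_D": p_D, "cos_CD": cos_CD, "cos_AC": cos_AC}

# Electron-positron annihilation into two photons, m = 1, gamma_A = 3,
# photon emitted at right angles to the electron in the CS.
result = two_body_kinematics(1.0, 1.0, 0.0, 0.0, 3.0, 0.0)
```

With these inputs the sketch can be compared against the closed-form results of Example 17.6.1, for instance M = m√(2(1 + γ_A)) and E_C^L = m(1 + γ_A)/2.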


Example 17.6.1 An electron with four-momentum $A^a$ interacts with a positron
of four-momentum $B^a$ producing two photons with four-momenta $C^a$ and $D^a$
respectively. Given that a. The positron rests in the laboratory and b. The electron
moves in the laboratory along the direction specified by the unit vector $\mathbf{e}_A$, with
speed factor $\gamma_A$, compute:

1. The mass M of the center of momentum particle

2. The factor $\beta^*_L$ of the CS in the lab system

3. The energy of the particles and the measure of their 3-momenta in the CS

4. The energy of the photons and the measure of their 3-momenta in the lab if the
angle of direction of the photon with the direction of motion of the electron in
the CS is $\theta^*_{AC} = \frac{\pi}{2}$

5. The angle $\theta^L_{AC}$ of the direction of motion of the photon with that of the electron
in the lab

6. The angle between the direction of motion of the photons in the lab

Solution
From the data of the problem we have: A = B = m, C = D = 0. Furthermore
it is given that the energy of the electron A in the lab is (c = 1):

$$E^L_A = m\gamma_A,$$




and its direction $\hat{\mathbf{A}} = \mathbf{e}_A$. Finally it is given that in the CS the direction of
propagation of the emitted photon is perpendicular to the direction of motion of
the electron, therefore $\hat{\mathbf{A}}^*\cdot\hat{\mathbf{C}}^* = \cos\frac{\pi}{2} = 0$. With these data we compute directly
from the previous formulae the required quantities.


1. From (17.69) we find M:

$$M = m\sqrt{2(1 + \gamma_A)}.$$

2. From (17.70) and (17.71) we find the $\lambda$ function:

$$\lambda^2(M^2, A^2, B^2) = M^4 + 2m^4 - 4M^2m^2 - 2m^4 = M^2(M^2 - 4m^2)$$

$$\lambda^2(M^2, C^2, D^2) = M^4.$$


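3. From (17.72), (17.73), and (17.74) we compute the velocity factors of the CS
in the lab:

$$\beta^*_L = \frac{M\sqrt{M^2 - 4m^2}}{M^2} = \sqrt{1 - 4\left(\frac{m}{M}\right)^2} = \sqrt{1 - \frac{2}{1 + \gamma_A}} = \sqrt{\frac{\gamma_A - 1}{\gamma_A + 1}},$$

$$\gamma^*_L = \frac{M^2 - m^2 + m^2}{2mM} = \frac{1}{2}\frac{M}{m} = \sqrt{\frac{1 + \gamma_A}{2}}.$$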


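4. From (17.77), (17.78), (17.79), and (17.80) we compute the energy of the
particles in the CS:

$$E^*_A = E^*_B = \frac{M^2 + m^2 - m^2}{2M} = \frac{M}{2} = m\sqrt{\frac{1 + \gamma_A}{2}}$$

$$E^*_C = E^*_D = \frac{M^2}{2M} = \frac{M}{2} = m\sqrt{\frac{1 + \gamma_A}{2}}.$$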


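5. From (17.81) we have for the length of the 3-momentum of the electron and the
positron in the CS:

$$|\mathbf{A}^*| = |\mathbf{B}^*| = \frac{1}{2M}M\sqrt{M^2 - 4m^2} = \frac{m}{2}\sqrt{\left(\frac{M}{m}\right)^2 - 4} = m\sqrt{\frac{\gamma_A - 1}{2}}$$

and from relation (17.82):

$$|\mathbf{C}^*| = |\mathbf{D}^*| = \frac{1}{2M}M^2 = \frac{M}{2}.$$

Because $C^a$, $D^a$ are null vectors the energy equals the measure of the 3-momenta,
that is $|\mathbf{C}^*| = |\mathbf{D}^*| = m\sqrt{\frac{1 + \gamma_A}{2}}$.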

6. In order to compute the energy and the measure of the 3-momenta of the
daughter particles in the lab we have to compute the inner products of the
four-vectors involved. Equations (17.83) and (17.84) give for the inner products
(CB), (DB):

$$(CB) = (DB) = -\frac{1}{4M^2}\left[M^2M^2\right] = -\frac{M^2}{4}$$

and equations (17.85) and (17.86) give for the inner products (CA), (DA):

$$(CA) = \frac{M^2}{4} - \frac{M^2}{2} = -\frac{M^2}{4}$$

$$(DA) = -\frac{M^2}{4} + \frac{M^2}{2} - \frac{M^2}{2} = -\frac{M^2}{4}.$$

From relations (17.87) and (17.88) we compute:

$$E^L_C = E^L_D = \frac{1}{m}\frac{M^2}{4} = m\,\frac{1 + \gamma_A}{2}.$$

Because $C^a$, $D^a$ are null, $|\mathbf{C}^L| = |\mathbf{D}^L| = m\,\frac{1 + \gamma_A}{2}$.


7. Using (17.89) and (17.90) we compute the magnitude of the 3-momentum of
the daughter particles in the lab:

$$|\mathbf{C}^L| = |\mathbf{D}^L| = \frac{M^2}{4m}.$$


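8. Relation (17.91) gives the angle between the daughter particles in the lab:

$$\cos\theta^L_{CD} = \frac{16m^2}{M^4}\left[-\frac{M^2}{2} + \frac{1}{m^2}\frac{M^4}{16}\right] = 1 - \frac{8m^2}{M^2} = \frac{\gamma_A - 3}{\gamma_A + 1}.$$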


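9. From (17.92) we compute the angle $\theta^L_{AC}$:

$$\cos\theta^L_{AC} = -\frac{4}{M^3\sqrt{M^2 - 4m^2}}\left[m^2M^2 - \frac{M^4}{4}\right] = \sqrt{1 - \frac{4m^2}{M^2}} = \sqrt{\frac{\gamma_A - 1}{\gamma_A + 1}}.$$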




10. Finally from (17.93) we compute the angle $\theta^L_{AC}$:

$$\sin\theta^L_{AC} = \frac{M/2}{M^2/(4m)}\sin\frac{\pi}{2} = \frac{2m}{M} = \sqrt{\frac{2}{1 + \gamma_A}}.$$


Example 17.6.2 (Compton scattering) A photon is scattered by an electron which
rests in the laboratory. If the energy of the incident photon in the laboratory is $E_A$
and the scattering angle (in the laboratory) is $\theta^L$, calculate the energy of the scattered
photon in the laboratory.³
Solution 

The reaction is:

$$\gamma + e^- \to \gamma + e^-.$$

Considering the reaction A + B → C + D we identify the following data for the
present problem:

$$A = C = 0, \quad B = D = m, \quad E^L_A = E_A, \quad \theta^L_{AC} = \theta^L.$$

From relation (17.69) we compute:

$$M^2 = m^2 + 2mE_A. \tag{17.94}$$

For the triangle function $\lambda(M^2, 0, m^2)$ we have from (17.30):

$$\lambda^2(M^2, 0, m^2) = (-M^2 + m^2)^2.$$

Replacing the data in (17.92) we end up with one equation whose unknown is the inner
product (CB):

$$\cos\theta^L = \frac{m^2\left[M^2 - m^2 + \frac{1}{m^2}(M^2 + m^2)(CB)\right]}{(M^2 - m^2)(CB)}.$$

We solve in terms of (CB) and find:

$$(CB) = -\frac{(M^2 - m^2)\,m^2}{M^2 + m^2 - (M^2 - m^2)\cos\theta^L}.$$


³For the standard treatment see Example 10.7.3.




Replacing $M^2$ from (17.94) we finally find:

$$(CB) = -\frac{m^2E_A}{m + E_A(1 - \cos\theta^L)}. \tag{17.95}$$


Having computed (CB) we replace it in the general relations and calculate the rest of
the elements of the reaction. For example the energy of C in the laboratory is:

$$E^L_C = -\frac{(CB)}{m} = \frac{E_A\,m}{m + (1 - \cos\theta^L)E_A}. \tag{17.96}$$


If the data of the problem are different then we work in a similar manner (see 
Example 17.6.3). 
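
The covariant route of Example 17.6.2 is easy to check numerically. The sketch below is illustrative only (the function names are hypothetical and c = 1 is assumed): it computes the scattered photon energy from the invariant (CB), using M² = m² + 2mE_A and E_C^L = −(CB)/m, and compares it with the familiar Compton formula E_C = E_A / (1 + (E_A/m)(1 − cos θ)).

```python
import math

def compton_energy_covariant(E_A, theta, m):
    """Scattered photon energy from the invariant (CB), c = 1."""
    M2 = m * m + 2.0 * m * E_A                      # (17.94)
    CB = -(M2 - m * m) * m * m / (
        M2 + m * m - (M2 - m * m) * math.cos(theta))
    return -CB / m                                  # E_C^L = -(CB)/m

def compton_energy_standard(E_A, theta, m):
    """Textbook Compton formula, used here only as a cross-check."""
    return E_A / (1.0 + (E_A / m) * (1.0 - math.cos(theta)))

# Illustrative numbers: energies in MeV, electron mass rounded to 0.511 MeV.
m_e = 0.511
E_in = 1.0
angle = math.radians(60.0)
E_out = compton_energy_covariant(E_in, angle, m_e)
```

Both functions should agree to rounding error for any angle, and the result also satisfies sin²(θ/2) = (m/2)(1/E_C − 1/E_A), which is the relation asked for in Exercise 17.6.1.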


Exercise 17.6.1 In Example 17.6.2 consider as given the energies $E_A$, $E_C$ in the
lab system of the incident and the scattered photon and prove that the angle of
scattering $\theta^L_{AC}$ is given by the expression:

$$\sin^2\frac{\theta^L_{AC}}{2} = \frac{m}{2}\left(\frac{1}{E_C} - \frac{1}{E_A}\right). \tag{17.97}$$

Hint: In the expression $E_C = -\frac{1}{m}(CB)$ replace (CB) from (17.95) and solve for
$\cos\theta^L_{AC}$.


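Example 17.6.3 Study the reaction A + B → C + D considering as data the
lengths A, B, C, D of the particle four-vectors and the scattering angles $\theta^L_{AC}$, $\theta^L_{AD}$.
Apply your results to the case of a final state of two photons.
Solution

Normal to the direction of motion of the particle A we have:

$$|\mathbf{C}^L|\sin\theta^L_{AC} = |\mathbf{D}^L|\sin\theta^L_{AD} \;\Rightarrow\; |\mathbf{D}^L| = \frac{\sin\theta^L_{AC}}{\sin\theta^L_{AD}}|\mathbf{C}^L|. \tag{17.98}$$

The inner products give:

$$\mathbf{A}^L\cdot\mathbf{C}^L = |\mathbf{A}^L||\mathbf{C}^L|\cos\theta^L_{AC}$$

$$\mathbf{A}^L\cdot\mathbf{D}^L = |\mathbf{A}^L||\mathbf{D}^L|\cos\theta^L_{AD}$$

$$\mathbf{A}^L\cdot(\mathbf{C}^L + \mathbf{D}^L) = |\mathbf{A}^L|\left[|\mathbf{C}^L|\cos\theta^L_{AC} + |\mathbf{D}^L|\cos\theta^L_{AD}\right].$$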




From the conservation of four-momentum we have $\mathbf{A}^L = \mathbf{C}^L + \mathbf{D}^L$. Replacing we
find:

$$|\mathbf{A}^L| = \frac{\sin(\theta^L_{AC} + \theta^L_{AD})}{\sin\theta^L_{AD}}|\mathbf{C}^L|. \tag{17.99}$$


We conclude that we can express the measure of the spatial parts of the four-momenta
$A^a$, $D^a$ in terms of the spatial part of the four-momentum $C^a$.
Conservation of four-momentum gives for the zeroth component:

$$E^L_A + B = E^L_C + E^L_D. \tag{17.100}$$

But for the particle A:

$$-A^2 = -(E^L_A)^2 + |\mathbf{A}^L|^2 \;\Rightarrow\; (E^L_A)^2 = A^2 + |\mathbf{A}^L|^2 = A^2 + \frac{\sin^2(\theta^L_{AC} + \theta^L_{AD})}{\sin^2\theta^L_{AD}}|\mathbf{C}^L|^2$$

and for the particles C, D:

$$(E^L_C)^2 = C^2 + |\mathbf{C}^L|^2$$

$$(E^L_D)^2 = D^2 + |\mathbf{D}^L|^2 = D^2 + \frac{\sin^2\theta^L_{AC}}{\sin^2\theta^L_{AD}}|\mathbf{C}^L|^2.$$


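Replacing in (17.100) we find the equation:

$$\sqrt{A^2 + \frac{\sin^2(\theta^L_{AC} + \theta^L_{AD})}{\sin^2\theta^L_{AD}}|\mathbf{C}^L|^2} + B = \sqrt{C^2 + |\mathbf{C}^L|^2} + \sqrt{D^2 + \frac{\sin^2\theta^L_{AC}}{\sin^2\theta^L_{AD}}|\mathbf{C}^L|^2} \tag{17.101}$$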


which we solve to determine $|\mathbf{C}^L|$. From (17.98) we determine $|\mathbf{D}^L|$. Then
from (17.89):

$$|\mathbf{C}^L|^2 = -C^2 + \frac{(CB)^2}{B^2}$$

we determine the inner product (CB) and from (17.90) the inner product (DB).
Then equations (17.87) and (17.88) give the energies $E^L_C$, $E^L_D$ of the daughter
particles in the laboratory. Finally from (17.91) we compute $M^2$ and consequently
any other quantity we wish. Obviously the calculations are involved, and this
indicates the usefulness of the covariant study of relativistic collisions, which
makes possible the solution of a problem with the use of algebraic computing
programmes.
Application.

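Final state of two photons means that the daughter particles are photons. This
implies C = D = 0 and from the above analysis we find:

$$|\mathbf{C}^L| = E^L_C,$$

$$E^L_D = |\mathbf{D}^L| = \frac{\sin\theta^L_{AC}}{\sin\theta^L_{AD}}E^L_C,$$

$$E^L_A = \sqrt{A^2 + \frac{\sin^2(\theta^L_{AC} + \theta^L_{AD})}{\sin^2\theta^L_{AD}}(E^L_C)^2}.$$

Replacing in (17.101) we find an equation whose sole unknown is the energy $E^L_C$:

$$\left[\frac{\sin^2(\theta^L_{AC} + \theta^L_{AD})}{\sin^2\theta^L_{AD}} - \left(1 + \frac{\sin\theta^L_{AC}}{\sin\theta^L_{AD}}\right)^2\right](E^L_C)^2 + 2B\left(1 + \frac{\sin\theta^L_{AC}}{\sin\theta^L_{AD}}\right)E^L_C + A^2 - B^2 = 0.$$

Solving we determine $E^L_C$ and from this all the quantities of the reaction.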

The above solution is crude. We present a second solution in accordance with the
previous considerations. We replace the data in (17.92) and find an equation whose
unknowns are the quantities (CB) and M:

$$\cos\theta^L_{AC} = \frac{B^2\left[M^2 + \frac{1}{B^2}(M^2 - A^2 + B^2)(CB)\right]}{\lambda(M^2, A^2, B^2)\,(CB)}. \tag{17.102}$$

We solve in terms of (CB):

$$(CB) = -\frac{M^2B^2}{M^2 - A^2 + B^2 - \lambda(M^2, A^2, B^2)\cos\theta^L_{AC}}. \tag{17.103}$$

If we replace C with D in the above relation we find without any further
calculations:

$$(DB) = -\frac{M^2B^2}{M^2 - A^2 + B^2 - \lambda(M^2, A^2, B^2)\cos\theta^L_{AD}}. \tag{17.104}$$

From the conservation equation we have, if we contract with $B^a$:

$$(AB) - B^2 = (CB) + (DB) \;\Rightarrow\; \frac{-M^2 + A^2 - B^2}{2} = (CB) + (DB). \tag{17.105}$$



Replacing in this last equation the inner products (CB), (DB) from equations
(17.103) and (17.104) we end up with one equation which contains only
$M^2$.


17.6.1 The Reaction B → C + D


The reaction B → C + D is a special case of A + B → C + D if A “disappears”.
This means two conditions:


a. $A^a = 0$ and

b. The energies $E^A_C$, $E^A_D$ of the daughter particles in the “proper” frame of A
vanish.

We conclude that in order to study reactions of the form B → C + D we
have simply to use the general formulae we have derived and consider as data
$A = 0$, $E^L_A = E^A_C = E^A_D = 0$.


Condition: $E^L_A = -\frac{1}{B}(AB) = 0 \Rightarrow M = B$.
Condition: $E^A_C = -\frac{1}{A}(AC) = 0 \Rightarrow (AC) = 0$. Then from (17.83) it follows that:

$$(CB) = -\frac{1}{4B^2}(B^2 + B^2)(B^2 + C^2 - D^2) = -\frac{1}{2}(B^2 + C^2 - D^2).$$

Working similarly we see that the condition on $E^A_D$ implies:

$$(DB) = -\frac{1}{2}(B^2 - C^2 + D^2).$$


Concerning the energies of the daughter particles in the laboratory, from
relations (17.87) and (17.88) we have:

$$E^L_C = \frac{1}{2B}(B^2 + C^2 - D^2)$$

$$E^L_D = \frac{1}{2B}(B^2 - C^2 + D^2).$$

From relations (17.89) and (17.90) we have that the magnitude of the 3-momentum
of the daughter particles is the same and equal to:

$$|\mathbf{C}^L| = |\mathbf{D}^L| = \frac{1}{2B}\lambda(B^2, C^2, D^2).$$




We conclude that the reactions 1 + 2 → 3 and 1 → 2 + 3 are completely
characterized by five parameters. One set of such parameters is the three masses of
the particles and the angles of motion of one particle in the proper frame of some
other. Obviously there are other sets of five parameters.

To check the above we replace the above results in (17.91) and find $\cos\theta^L_{CD} = -1$,
hence $\theta^L_{CD} = \pi$ as expected.
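
The decay formulae lend themselves to a short numerical check. The following Python sketch is illustrative only (hypothetical function names, c = 1): it evaluates the energies and the common 3-momentum of the daughters in the rest frame of B, using E_C = (B² + C² − D²)/(2B), E_D = (B² − C² + D²)/(2B) and |C| = |D| = λ(B², C², D²)/(2B). The masses used for the charged pion and the muon are rounded values taken as assumptions for the example.

```python
import math

def triangle(x, y, z):
    """Triangle function lambda(x, y, z) = sqrt(x^2+y^2+z^2-2xy-2xz-2yz)."""
    val = x * x + y * y + z * z - 2 * x * y - 2 * x * z - 2 * y * z
    return math.sqrt(max(val, 0.0))  # clip tiny negative rounding errors

def two_body_decay(B, C, D):
    """Energies and common 3-momentum of C, D in the rest frame of B (c = 1)."""
    if B < C + D:
        raise ValueError("decay is kinematically forbidden: B < C + D")
    E_C = (B * B + C * C - D * D) / (2.0 * B)
    E_D = (B * B - C * C + D * D) / (2.0 * B)
    p = triangle(B * B, C * C, D * D) / (2.0 * B)
    return E_C, E_D, p

# Illustrative numbers (MeV): pion -> muon + massless neutrino.
E_mu, E_nu, p = two_body_decay(139.57, 105.66, 0.0)
```

Energy conservation (E_C + E_D = B) and the mass-shell conditions E² − p² = mass² hold for the returned values, which is a quick way to catch a sign slip in the formulae.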


Chapter 18
Waves in Special Relativity


18.1 Introduction 


The electromagnetic waves, which are inevitably studied in Special Relativity, 
do not exhaust all types of waves. Indeed, as it has been remarked already in 
many cases, Special Relativity is a theory of all physical phenomena including 
electromagnetism. Therefore within Special Relativity one must be able to deal with 
e.g. thermal waves, acoustic waves etc. In this section we discuss the generic concept 
of a wave in Special Relativity. Waves are rather difficult to understand properly, 
even in Newtonian Physics where one has a direct sensory observation of space, time 
and motion. Moreover, even there, when the subject is approached at a slightly higher
level it demands rather heavy mathematical formalism, and one gradually forms the
opinion that somewhere Physics is lost and mathematics prevails. The situation is
even worse in Special Relativity where one is obliged to work with a geometric 
concept of motion right from the beginning. Therefore in order to study the waves in 
Special Relativity it is best to reformulate the Newtonian wave theory in a geometric 
rather than in the standard formalism. The subject of relativistic waves is vast and 
in this book we aim to deal only with the basic elements of it; in fact we shall take
the subject up to the point where one can properly apply optics and understand the
de Broglie waves, which are required for a proper understanding of the Schrödinger
equation and consequently of Quantum Mechanics.


18.2. The Disturbance 


During the course of time the word ‘wave’ has been used in various contexts and
meanings, which has resulted in confusion about what is meant by a ‘wave’.
Therefore we feel that we should attempt a definition of wave in terms of the 


© Springer Nature Switzerland AG 2019 647 
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-27347-7_18 




concept of disturbance, a concept which is easily understood and has not been used 
extensively. 

Let us start with an example. Consider a perfectly elastic spring which lies taut and
at rest on a smooth horizontal table with both its ends fixed. An observer O can
describe the state of the spring by means of phrases such as ‘the (linear)
density of the spring is constant’, ‘the relative velocity of adjacent particles of the
spring is zero’ etc. Mathematically these phrases are formulated as follows. Let $\mathbf{r}$
be the position vector of an arbitrary point along the spring in the frame of O. The
first phrase means that there exists a real valued continuous function (continuous
because the spring is assumed to have no kinks) $\rho(\mathbf{r})$ = constant, the second that there
exists a vector valued function $\mathbf{v}(\mathbf{r}) = 0$ etc. The density and the relative velocity do
not exhaust the physical quantities describing the state of the spring. For example
one could consider the temperature along the spring. The tensor fields which one can
associate with the state of the spring are not independent; they are related by
the laws of Physics and/or additional simplifying or other types of assumptions. The
Principle of Relativity in Newtonian Physics and in Special Relativity requires that:

All physical quantities which describe the state of a physical system are described 
mathematically by a tensor field. In Newtonian Physics these tensors are Newtonian 
tensors and in Special Relativity Lorentz tensors. 

It is easy to accept that in the state of ‘restness’ all physical quantities of the
spring will be described by equations of the form $\phi_I(\mathbf{r})$ = constant where I is a
collective index specifying the tensorial character of the field $\phi_I(\mathbf{r})$.

We come now to the concept of a disturbance. 


Definition 18.2.1 When an outside effect is applied on a physical system which
causes a change of the state of the system (not necessarily its motion) then we say
that a disturbance propagates in the physical system. The disturbance is quantified
by means of various physical quantities which are described mathematically by
tensor fields $\phi_I(t, \mathbf{r})$ of appropriate type, where I is a collective tensor index.


Let us apply Definition 18.2.1 to the spring discussed above. Suppose that while
the spring rests on the table a pulse is applied at one of its ends. Then the observer
O notes that the state of the spring changes, in the sense that the various tensor
fields which describe the state of the spring and which were constant now
vary with time and along the spring. For example the mass density of the
spring at the point $\mathbf{r}$ at the time moment $t_1$ of O is $\rho(t_1, \mathbf{r})$, at the time moment $t_2$ it is
$\rho(t_2, \mathbf{r})$ and so on. The same applies to other physical fields of the spring (velocity
field, temperature field etc.). We conclude that, in general, during the propagation of
a disturbance many physical fields are activated. For ‘smooth’ (that is, not very
‘intense’ and ‘sudden’) disturbances these fields are expected to be equivalent for
the description of the state of the spring, hence of the disturbance. For example if
one knows the density of the spring $\rho(t, \mathbf{r})$ then one is able to calculate the velocity
field along the spring by making use of the laws of Physics and the structure of the
spring (perfect elasticity etc.). This observation is general and applies to all physical
properties of the spring and to physical systems in general.




The mathematical definition of a disturbance as a tensor field (or a set of tensor
fields) is very general and incorporates all possible notions which can be attributed
to the concept of wave. Furthermore, it dissociates the concept of the wave from direct
sensory images such as a waving rope, a spring, a column of gas in a closed tube
etc.


18.3 Waves in Newtonian Physics 


In Newtonian Physics the waves are special disturbances which are defined as 
follows. 


Definition 18.3.1 Consider a disturbance which is described by the Euclidean
tensor field $\phi_I(t, \mathbf{r})$. We shall say that the disturbance is a classical wave if in the
3-space $E^3$ the tensor field $\phi_I(t, \mathbf{r})$ which describes the disturbance is of the form:


$$\phi_I(t, \mathbf{r}) = A_I(t, \mathbf{r})\,e^{i k_\mu (x^\mu - w^\mu t)} \tag{18.1}$$


where $A_I(t, \mathbf{r})$ is another Euclidean tensor field of the same type as $\phi_I(t, \mathbf{r})$ called
the amplitude of the wave and $k_\mu(x^\mu - w^\mu t)$ is a dimensionless Euclidean invariant
called the phase of the wave. The vector $k^\mu$ is called the wave vector of the wave
and the vector $w^\mu$ the phase velocity of the wave.


The waves are classified according to the tensorial character of the
index I into scalar, vector etc. waves. For example a density wave is a scalar wave and
an electromagnetic wave is a (0, 2) tensor wave. If the phase is independent of time
the wave is called a standing wave, otherwise it is called a traveling wave.

A second classification of waves is according to their phase as follows. The phase
of a Newtonian wave defines the 2-d surfaces:


$$S(t, \mathbf{r}) = k_\mu(x^\mu - w^\mu t) = \text{constant} \tag{18.2}$$


which we call wavefronts. Waves are classified according to the shape of the 
wavefronts. For example if the wavefronts are planes/spheres/cylinders etc. the 
corresponding waves are called plane waves/spherical waves/cylindrical waves 
etc. respectively. 

The existence of waves depends on the structure of the physical system and the 
space where the disturbance propagates. For example the electromagnetic waves in 
Special Relativity concern a disturbance which propagates in empty space, whereas 
in Newtonian Physics these waves are considered as disturbances of the ether. 

The quantity $\omega = k_\mu w^\mu$ is called the cyclic frequency of the wave. In terms of
the wave vector and the cyclic frequency the phase of the wave is written as follows:


$$N(\mathbf{r}, t) = k_\mu x^\mu - \omega t = \text{constant}. \tag{18.3}$$




Fig. 18.1 Definition of phase 
velocity 


The phase velocity has to do with the ‘motion’ of geometric surfaces (the
wavefronts) in space and not with the motion of particles, therefore it is not restricted
by the speed of light in vacuum. In order to define the phase velocity of a wave
wrt an observer O we consider the equation $S(t, \mathbf{r}) = C$ (C = constant) which
defines the wavefronts of the wave. Let $S(t_1, \mathbf{r}) = C$ be the wavefront at time $t_1$
and $S(t_1 + dt_1, \mathbf{r}) = C$ (same C!) the wavefront at the moment $t_1 + dt_1$. We define the phase velocity
of the wave at the time $t_1$ at the arbitrary point P of the surface $S(t_1, \mathbf{r}) = C$,
whose position vector is $\mathbf{r}_P$, as the normal distance between the wavefronts
$S(t_1, \mathbf{r}) = C$ and $S(t_1 + dt_1, \mathbf{r}) = C$ at the point P divided by $dt_1$ (see Fig. 18.1).

Let us calculate the phase velocity $\mathbf{w}(t, \mathbf{r})$ of the Newtonian wave with wavefronts
$S(t, \mathbf{r}) = C$. We have¹:


$$0 = dS = S(t_1 + dt_1, \mathbf{r} + d\mathbf{r}) - S(t_1, \mathbf{r}) = \left.\frac{\partial S}{\partial \mathbf{r}}\right|_{t_1}\cdot d\mathbf{r} + \left.\frac{\partial S}{\partial t}\right|_{t_1}dt_1 = \left[\nabla S|_P\cdot\mathbf{w} + \left.\frac{\partial S}{\partial t}\right|_P\right]dt_1 \tag{18.4}$$

hence:

$$\mathbf{w}(t, \mathbf{r}) = -\frac{\partial S/\partial t}{|\nabla S|^2}\,\nabla S. \tag{18.5}$$


This is the most general expression for the phase velocity of a classical wave. It 
is a Newtonian vector hence it is possible to describe a Newtonian physical quantity. 
We note that the phase velocity depends on the shape of the wavefronts and not on 
the tensor field (the amplitude) characterizing the wave. This is the reason we are 
able to classify waves as plane, spherical etc. independently of their nature, that is 
of the physical property they refer to. In the next subsection we determine the phase 
velocity of some standard types of waves. 
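
Relation (18.5) can also be evaluated numerically for any wavefront function. The sketch below is a minimal illustration, not part of the original text: it approximates the partial derivatives of S by central finite differences (the step `h` and the function name `phase_velocity` are assumptions made for the example) and returns w = −(∂S/∂t) ∇S / |∇S|².

```python
def phase_velocity(S, t, r, h=1e-6):
    """Approximate w = -(dS/dt) grad(S) / |grad(S)|^2 at time t and point r.

    S : callable S(t, (x, y, z)) describing the wavefronts S = constant
    r : tuple (x, y, z)
    h : finite-difference step (assumed small compared with the scales of S)
    """
    dS_dt = (S(t + h, r) - S(t - h, r)) / (2.0 * h)
    grad = []
    for i in range(3):
        plus = list(r)
        minus = list(r)
        plus[i] += h
        minus[i] -= h
        grad.append((S(t, tuple(plus)) - S(t, tuple(minus))) / (2.0 * h))
    norm2 = sum(g * g for g in grad)
    if norm2 == 0.0:
        raise ValueError("gradient of S vanishes; phase velocity undefined")
    return tuple(-dS_dt * g / norm2 for g in grad)

# Plane wavefronts x - a t = C with a = 2: expected w close to (2, 0, 0).
plane = lambda t, r: r[0] - 2.0 * t
w_plane = phase_velocity(plane, 0.0, (1.0, 0.5, 0.0))

# Hyperbolic wavefronts x^2 - y^2 - a t = C with a = 3, at the point (1, 2, 0).
hyper = lambda t, r: r[0] ** 2 - r[1] ** 2 - 3.0 * t
w_hyper = phase_velocity(hyper, 0.0, (1.0, 2.0, 0.0))
```

For the plane case the result reproduces w = a along the normal to the planes, and for the hyperbolic case it agrees with a(x i − y j)/(2(x² + y²)) obtained by applying (18.5) by hand.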

When a medium is homogeneous and isotropic the phase velocity is normal to 
the wavefronts. A line normal to the wavefronts, indicating the direction of motion 
of the wave, we call a ray. 


¹Because the geometry of the space is Euclidean the norm $|\nabla S| \neq 0$.




Fig. 18.2 Waveform along a string, panels (a) and (b)


To illustrate the above concepts consider a long string stretched along the x-axis
and produce at the end of the string, at x = 0, a pulse as shown in Fig. 18.2a. Let
t = 0 be the moment of creation of the pulse (= disturbance) and let y = g(x) be the
shape of the string (the pulse) at t = 0. Experiment shows that as time passes the
pulse travels along the string without changing form, provided frictional losses are
negligible (ideal string). Mathematically this is expressed by the requirement:


$$y(x, t) = g(x - wt)$$


where w is the phase velocity of the pulse along the x-axis as shown in Fig. 18.2b
(wave traveling to the right). If the pulse g(x) travels in the opposite direction then
w → −w and we have:


$$y(x, t) = g(x + wt)$$


(wave traveling to the left). In order to specify the wave further one has to define a
specific function g. If we choose the general form:


$$g(x - wt) = A(x)\cos(x - wt)$$


then we must have A(x) = g(x, t = 0). We see that the amplitude is independent of
the phase and specifies the disturbance at t = 0. If we take $g(x - wt) = A(x)e^{i(x - wt)}$
then we should keep only the real part of the term $e^{i(x - wt)}$.

The value $\phi_I(\mathbf{r}, t = t_1)$ is called a snapshot or waveform of the wave at the
moment $t_1$. It is possible to visualize the propagation of the wave as a continuous
sequence of snapshots (i.e. as in a movie film).


18.4 Plane Waves 


These waves are characterized by the fact that their wavefronts are 2-d planes, with
equation:


$$S(\mathbf{r}, t) = \hat{\mathbf{k}}\cdot\mathbf{r} - at = C \tag{18.6}$$




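where $\hat{\mathbf{k}}$ is the unit normal to the planes and $a \in R$ is a real constant. If a = 0 then
the wave is a standing plane wave and when a ≠ 0 the wave is a traveling plane
wave. We compute:


$$\nabla S = \frac{\partial S}{\partial \mathbf{r}} = \hat{\mathbf{k}}, \qquad \frac{\partial S}{\partial t} = -a$$

and (18.5) implies that the phase velocity is:

$$\mathbf{w} = a\hat{\mathbf{k}}. \tag{18.7}$$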


Because a, $\hat{\mathbf{k}}$ are constant we infer that the phase velocity of plane waves is
constant. If we choose the x-axis along the direction of $\hat{\mathbf{k}}$ then the equation of the
wavefronts is written:


$$x - at = C \tag{18.8}$$


and the wavefronts are planes parallel to the y, z plane which propagate (when
a ≠ 0) with ‘speed’ a (traveling plane wave) or they do not move (standing plane
wave) (see Fig. 18.3).

Examples of plane waves are many and well known. One such is the oscillations 
of a column of gas in a closed tube (where one has waves of pressure, density etc.). 
Another case is the longitudinal wave which propagates along a free metallic rod 
when it is hit at one of its ends with a metallic hammer. A well known example of a 
plane wave is the following. 


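Example 18.4.1 Maxwell’s equation for the electric field propagating in a homogeneous
and isotropic medium is:


$$\nabla^2\mathbf{E} - \frac{1}{u^2}\frac{\partial^2\mathbf{E}}{\partial t^2} = 0 \tag{18.9}$$

where $u = \frac{1}{\sqrt{\mu\epsilon}}$ is the speed of the wave in the medium and $\mu$, $\epsilon$ are the
magnetic permeability and the dielectric constant of the medium respectively. We
look for plane wave solutions of the form $\mathbf{E} = \mathbf{E}_0e^{-i(\mathbf{k}\cdot\mathbf{x} - \omega t)}$ where $\mathbf{k}$, $\omega$ are constant
quantities. We compute:


$$\frac{\partial^2\mathbf{E}}{\partial t^2} = -\omega^2\mathbf{E}, \qquad \nabla^2\mathbf{E} = -k^2\mathbf{E}.$$

Replacing in (18.9) we find that the quantities k, $\omega$ must satisfy the relation:

$$k^2 = \left(\frac{\omega}{u}\right)^2 \;\Rightarrow\; k = \pm\frac{\omega}{u}.$$

We observe that the solution depends on the magnitude of $\mathbf{k}$, hence the solution will
be:


$$\mathbf{E} = \mathbf{E}_0e^{i(\mathbf{k}\cdot\mathbf{x} \pm \omega t)}.$$

We infer that the general solution of (18.9) is:

$$\mathbf{E} = \mathbf{A}f(\mathbf{k}\cdot\mathbf{x} - \omega t) + \mathbf{B}g(\mathbf{k}\cdot\mathbf{x} + \omega t)$$


that is, two vector plane traveling waves propagating along the directions $\pm\hat{\mathbf{k}}$. The
wavefronts of the wave $\mathbf{A}f(\mathbf{k}\cdot\mathbf{x} - \omega t)$ are given by the equation:


$$\mathbf{k}\cdot\mathbf{x} - \omega t = k\hat{\mathbf{k}}\cdot\mathbf{x} - \omega t = k(\hat{\mathbf{k}}\cdot\mathbf{x} - ut) = \text{constant}.$$


It is easily shown that for the plane wave $\mathbf{B}g(\mathbf{k}\cdot\mathbf{x} + \omega t)$ the phase velocity is
$w = -u$.


Fig. 18.3 Plane waves. The wavefronts are spaced a wavelength apart and the
arrows represent rays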

18.5 Spherical Waves 


In these waves the wavefronts are spherical surfaces with equation:

$$S(\mathbf{r}, t) = \mathbf{e}_r\cdot(\mathbf{r} - \mathbf{r}_0) - at = C \tag{18.10}$$

where $\mathbf{r}_0$ is the position vector of a point in space which we call the center of the
spherical wave and a, C are constants. Choosing the center of the spherical wave
as the origin of the coordinate system the wavefronts in spherical coordinates are
given by (see Fig. 18.4):

$$r - at = C \tag{18.11}$$


where r is the radial coordinate. We compute:

$$\nabla S = \mathbf{e}_r, \qquad \frac{\partial S}{\partial t} = -a$$


hence the phase velocity of a spherical wave is:


$$\mathbf{w} = a\,\mathbf{e}_r. \tag{18.12}$$


Fig. 18.4 Spherical waves. The wavefronts are spaced a wavelength apart and the
arrows represent rays

Fig. 18.5 Hyperbolic waves. The wavefronts are spaced one wavelength apart


If a > 0 the wave is called an outgoing spherical wave and when a < 0 an 
incoming spherical wave. We note that the plane and the spherical waves are one 
dimensional in the sense that in order to describe the wavefronts it is enough to 
specify one coordinate only (x and r respectively). 


Example 18.5.1 Calculate the phase velocity of a hyperbolic wave whose wavefronts
are defined by the equation:


$$x^2 - y^2 - at = C \tag{18.13}$$


where a, C are constants.
Solution

In Fig. 18.5 the wavefronts are shown in the plane x, y (the z-axis is normal to
the plane). The wavefronts are hyperbolic 2-d surfaces which are outgoing if a > 0,
incoming if a < 0 and standing if a = 0. We compute:


$$\nabla S = \nabla(x^2 - y^2 - at) = 2x\,\mathbf{i} - 2y\,\mathbf{j}, \qquad \frac{\partial S}{\partial t} = -a$$

hence:

$$\mathbf{w} = a\,\frac{2x\,\mathbf{i} - 2y\,\mathbf{j}}{4(x^2 + y^2)} = \frac{a}{2(x^2 + y^2)}\,(x\,\mathbf{i} - y\,\mathbf{j}).$$




Working in the same way one computes the phase velocity of an arbitrary 
Newtonian wave whose wavefronts are known. 


Example 18.5.2 Consider a disturbance in the pressure which propagates in a fluid
according to the expression $p(r, t) = p_0 + \frac{b}{r}f(kr + \omega t)$ where $p_0$, b are constants.
This disturbance is a damped wave with:


$$A(r, t) = \frac{b}{r}$$

$$wk = \omega \;\Rightarrow\; w = \frac{\omega}{k}.$$

The constant $p_0$ plays no role; it simply determines the state of equilibrium. These
waves are spherical damped waves.

18.6 Linear Superposition of Waves 


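Consider the general wave:


$$\phi_I(\mathbf{r}, t) = A_I(\mathbf{r}, t)\,f\left[a\hat{\mathbf{k}}\cdot(\mathbf{r} + \mathbf{w}t)\right]$$


where the wave vector $\mathbf{k} = a(\mathbf{r})\hat{\mathbf{k}}$. The tensor $A_I(\mathbf{r}, t)$ can be decomposed in a basis
$\{e_I(\alpha)\}$ of the space ($\alpha$ is a collective index) as follows:


$$A_I(\mathbf{r}, t) = \sum_\alpha A^\alpha(\mathbf{r}, t)\,e_I(\alpha).$$


This decomposition leads to:


$$\phi_I(\mathbf{r}, t) = \sum_\alpha A^\alpha(\mathbf{r}, t)\,f\left[a\hat{\mathbf{k}}\cdot(\mathbf{r} + \mathbf{w}t)\right]e_I(\alpha) = \sum_\alpha \phi_\alpha(\mathbf{r}, t)\,e_I(\alpha). \tag{18.14}$$


This relation is understood as a linear superposition of waves in the sense that the
“component” waves $\phi_\alpha(\mathbf{r}, t)$ add up their wave motion (= disturbance) to produce the
original wave $\phi_I(\mathbf{r}, t)$. Working conversely we may add a number of waves of the
same type to produce a new wave. The resulting wave can be quite different from the
component waves.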


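Example 18.6.1 Consider the parallel plane waves $f_1 = f_0\sin(kx - \omega t)$ and
$f_2 = f_0\sin(k'x - \omega't)$. The composite wave is:


$$f = f_1 + f_2 = 2f_0\cos\frac{1}{2}\left[(k' - k)x - (\omega' - \omega)t\right]\,\sin\frac{1}{2}\left[(k' + k)x - (\omega' + \omega)t\right].$$


This is a new wave with the following characteristics:


$$A(x, t) = 2f_0\cos\frac{1}{2}\left[(k' - k)x - (\omega' - \omega)t\right]$$

$$k = \frac{1}{2}(k' + k)$$

$$kw = \frac{1}{2}(\omega' + \omega) \;\Rightarrow\; w = \frac{\omega' + \omega}{k' + k}.$$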
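
The product form of the composite wave can be checked numerically. The following Python sketch is illustrative only (the function and parameter names are hypothetical): it evaluates the sum of the two sine waves and the product of the slowly varying amplitude with the carrier, and it returns the phase speed (ω′ + ω)/(k′ + k) of the carrier.

```python
import math

def two_wave_sum(x, t, f0, k1, omega1, k2, omega2):
    """Direct sum f1 + f2 of two sine waves of equal amplitude f0."""
    return (f0 * math.sin(k1 * x - omega1 * t)
            + f0 * math.sin(k2 * x - omega2 * t))

def product_form(x, t, f0, k1, omega1, k2, omega2):
    """Same wave written as amplitude(x, t) * carrier(x, t)."""
    amplitude = 2.0 * f0 * math.cos(0.5 * ((k2 - k1) * x - (omega2 - omega1) * t))
    carrier = math.sin(0.5 * ((k2 + k1) * x - (omega2 + omega1) * t))
    return amplitude * carrier

def carrier_phase_speed(k1, omega1, k2, omega2):
    """Phase speed of the composite wave, w = (omega' + omega) / (k' + k)."""
    return (omega1 + omega2) / (k1 + k2)

# Illustrative values: two waves with slightly different k and omega.
params = (1.0, 10.0, 20.0, 10.5, 21.5)   # f0, k, omega, k', omega'
difference = max(
    abs(two_wave_sum(x, t, *params) - product_form(x, t, *params))
    for x in (0.0, 0.3, 1.7, 4.2)
    for t in (0.0, 0.1, 0.9)
)
w = carrier_phase_speed(*params[1:])
```

The quantity `difference` stays at rounding level, which confirms the trigonometric identity used in the example; the two component waves here have phase speeds 2.0 and about 2.05, and the composite carrier speed lies between them.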


18.7 Period and Wavelength of a Wave 


Definition 18.7.1 We say that the wave $\phi_I(\mathbf{r}, t)$ has a period T, or that it is a
periodic wave with period T, wrt the Newtonian observer O if the following
condition is satisfied:


$$\phi_I(\mathbf{r}, t) = \phi_I(\mathbf{r}, t + T) \quad (t \in R). \tag{18.15}$$


Because this equation is covariant a periodic Newtonian wave for one Newtonian
observer is also periodic for all Newtonian observers, however with different period
(due to the Doppler shift). If we replace in (18.15) the $\phi_I(\mathbf{r}, t + T)$ from (18.14) it
follows that the component waves $\phi_\alpha(\mathbf{r}, t)$ are also periodic with the same period
T. An example of a periodic wave is the scalar field:


$$\phi(\mathbf{r}, t) = \sin\left[\frac{2\pi}{wT}\,\hat{\mathbf{k}}\cdot(\mathbf{r} - \mathbf{w}t)\right].$$


In a periodic wave with period T the quantity $\nu = \frac{1}{T}$ is called the frequency of
the wave. Another quantity associated with a periodic wave is the wavelength:

$$\lambda = wT$$


where w is the phase speed. The wavelength is not an invariant (the frequency is,
because T is a Newtonian invariant) because the phase velocity w is not covariant.
For this reason it is not a Newtonian physical quantity and its use should be restricted
even in Newtonian theory. However it is used extensively in applications, perhaps
due to the Newtonian concept of length. Needless to say, in Special Relativity the
wavelength is rarely used for obvious reasons.


18.8 Relativistic Waves 


In the previous chapters of this book we have studied the various concepts of
Special Relativity starting from a corresponding concept of Newtonian Physics. Thus
e.g. to the velocity/acceleration/force etc. we associated the four-velocity/four-acceleration/four-force
etc. respectively. With the electromagnetic field we had no
Newtonian analogue and we used Maxwell equations to introduce the electromagnetic
field tensor $F_{ab}$. Working in the same spirit we look for the analogue of the
Newtonian waves. Because the phase is the characterizing element of a Newtonian
wave we concentrate on it and define the relativistic wave.


18.8.1 The Frequency Four-Vector 


As we have seen, the quantity which characterizes the wave nature of a Newtonian
wave is the phase:


$$N(\mathbf{r}, t) = \mathbf{k}\cdot\mathbf{r} - \omega t \tag{18.16}$$


where k is the wave vector and w the angular frequency of the wave. The phase 
defines the wavefronts of the wave: 


N(r, t) = constant (18.17) 


which is assumed to be a Newtonian invariant. 
The function $N(\mathbf{r}, t)$ can be written formally as follows:


$$N(\mathbf{r}, t) = \frac{2\pi}{c}\begin{pmatrix} ct \\ \mathbf{r} \end{pmatrix}_\Sigma \circ \begin{pmatrix} \nu \\ \frac{c\nu}{w}\hat{\mathbf{k}} \end{pmatrix}_\Sigma$$


where $\circ$ stands for the Lorentz product of four-vectors, $\Sigma$ is the Newtonian inertial
frame where the wave is considered and $\nu = \omega/2\pi$ is the frequency of the wave. Of
course this formal writing has no power in Special Relativity because the quantity
$N(\mathbf{r}, t)$ is not a Lorentz invariant quantity. However we note that because $\begin{pmatrix} ct \\ \mathbf{r} \end{pmatrix}_\Sigma$ are
the components of the position four-vector in spacetime, if we define that $\begin{pmatrix} \nu \\ \frac{c\nu}{w}\hat{\mathbf{k}} \end{pmatrix}_\Sigma$
are the components of a four-vector $f^a$, i.e.:


$$f^a = \begin{pmatrix} \nu \\ \frac{c\nu}{w}\hat{\mathbf{k}} \end{pmatrix}_\Sigma \tag{18.18}$$


in the frame where we study the wave, then automatically the phase $N(\mathbf{r}, t)$ becomes
a relativistic invariant, hence it stands as a relativistic physical quantity. We assume
that this is done and the new four-vector $f^a$ we call the frequency four-vector. For
the time being we do not assume whether it is a timelike or a spacelike four-vector.

The frequency four-vector is a potential relativistic physical quantity and will
become a physical relativistic quantity only after we identify it with a corresponding
Newtonian quantity in a specific frame or, if such a quantity does not exist,
we postulate it to be a pure relativistic quantity (as we do for example with c).
We decide to identify the frequency four-vector with the Newtonian wave in $\Sigma$
whose frequency is $\nu$ and whose phase velocity is w. The physical implications of this
identification will be found when we consider the behavior of the frequency
four-vector under the Lorentz transformation.

The wavefronts of a relativistic wave are three-dimensional hypersurfaces in
Minkowski space with equation $N(\mathbf{r}, t) = x^af_a$ = constant. A plane relativistic
wave is defined by the requirement $N(x^a) = f^a(x_a - c_a) = 0$ where the frequency
four-vector $f^a$ is assumed to be constant and $c_a$ is a constant four-vector. Obviously
$f^a$ is normal to the wavefronts.

In analogy with Newtonian Physics, in Special Relativity a wave is defined by Lorentz tensor fields $\Phi_i(x^a)$ of the form:

$$\Phi_i(x^a) = A_i(x^a)\, F(x^a f_a) \qquad (18.19)$$

where $F(x^a f_a)$ is a function of the (relativistic) phase $x^a f_a$. $A_i(x^a)$ is a Lorentz tensor of the same type as $\Phi_i$ called the amplitude of the (relativistic) wave.

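Consider a RIO $\Sigma$ in which the wave has wave vector $\mathbf{k}$ and frequency $\nu$ so that the frequency four-vector in $\Sigma$ is:

$$f^a = \begin{pmatrix} \nu \\ \frac{c}{w}\,\nu\,\hat{\mathbf{k}} \end{pmatrix}_{\Sigma} \qquad (18.20)$$

where $\hat{\mathbf{k}}$ is the unit vector in the direction of propagation of the wave in $\Sigma$. Replacing $N(\mathbf{r},t)$ in the place of $S(\mathbf{r},t)$ in (18.5) we find the phase velocity $\mathbf{w}$ of the relativistic wave in $\Sigma$ to be the 3-vector:

$$\mathbf{w} = \frac{2\pi\nu}{k}\,\hat{\mathbf{k}}. \qquad (18.21)$$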


Exercise 18.8.1 Show that the frequency and the wave speed of a wave with frequency four-vector $f^a$ in $\Sigma$ are given by the covariant expressions:

$$\nu = -\frac{1}{c} f^a u_a \qquad (18.22)$$

$$w^2 = \frac{c^2}{1 + \dfrac{c^2 f^a f_a}{(f^b u_b)^2}} \qquad (18.23)$$

where $u^a$ is the four-velocity² of $\Sigma$.


² Note the difference between $\mathbf{w}$ and $u^a$!




The frequency four-vector can be timelike, null or spacelike if the phase speed is $> c$, $= c$, $< c$ respectively. Indeed let us assume that the frequency four-vector for some wave is timelike. Then from (18.88) we have

$$f^a f_a = -\nu^2 + \frac{c^2\nu^2}{w^2} < 0 \;\Rightarrow\; c < w \qquad (18.24)$$

that is, the phase speed of the wave is greater than c. In the case of electromagnetic waves propagating in empty space we find $w = c$. This result means that the frequency four-vector of a photon (= electromagnetic wave) of frequency $\nu$ which propagates in empty space along the 3-direction $\hat{\mathbf{k}}$ in a RIO $\Sigma$ is:

$$f^a = \nu \begin{pmatrix} 1 \\ \hat{\mathbf{k}} \end{pmatrix}_{\Sigma}. \qquad (18.25)$$

We shall use these results in a later chapter to define the correspondence between the wave and the particle nature of photons.
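The classification above can be checked numerically. The following is a minimal sketch, assuming the components $(\nu, \frac{c}{w}\nu\hat{\mathbf{k}})$ and the signature $(-,+,+,+)$ used in this section; the function names and the tolerance are illustrative choices and do not come from the text.

```python
C = 299_792_458.0  # speed of light in vacuum (m/s)

def frequency_four_vector(nu, w, k_hat):
    """Components (nu, (c/w) nu k_hat) of f^a in the frame where the wave is studied."""
    return [nu] + [C / w * nu * k for k in k_hat]

def lorentz_norm(f):
    """f^a f_a with signature (-, +, +, +)."""
    return -f[0] ** 2 + sum(x * x for x in f[1:])

def classify(nu, w, rel_tol=1e-12):
    """Character of f^a for a wave of frequency nu and phase speed w."""
    norm = lorentz_norm(frequency_four_vector(nu, w, (1.0, 0.0, 0.0)))
    if abs(norm) <= rel_tol * nu ** 2:   # illustrative tolerance for the null case
        return "null"
    return "timelike" if norm < 0 else "spacelike"

print(classify(1.0, 2.0 * C))   # phase speed above c
print(classify(1.0, C))         # light in vacuum
print(classify(1.0, 317.0))     # sound, with the phase speed quoted later in the text
```

The three calls cover the three cases $w > c$, $w = c$ and $w < c$.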


Example 18.8.1 The RIO O observes the one dimensional wave $\chi = a\cos 2\pi\nu\left(\frac{x}{w} - t\right)$ where $w > c$ is the phase velocity of the wave.


1. Calculate the frequency four-vector 

2. Calculate the phase velocity of the wave wrt O 

3. Another RIO O’ is related to O in the standard way with speed s. Calculate the 
frequency and the phase velocity of the wave wrt O’. 


Solution 


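1. The phase of the wave is written:

$$2\pi\nu\left(\frac{x}{w} - t\right) = \frac{2\pi}{c}\left(\nu, \frac{c\nu}{w}\right)\circ(ct, x)$$

hence the frequency four-vector for O is:

$$f^a = \left(\nu, \frac{c\nu}{w}, 0, 0\right)_O.$$

We compute the length

$$f^a f_a = -\nu^2\left(1 - \frac{c^2}{w^2}\right).$$

Because $w > c$ we have that the four-vector $f^a$ is a timelike vector.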




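2. The phase velocity of the wave wrt O is $\mathbf{w} = w\hat{\mathbf{x}}$ ($w > c$) where (see (18.23)):

$$w^2 = \frac{c^2}{1 + \dfrac{c^2 f^a f_a}{(f^b u_b)^2}}$$

and $u^a$ is the four-velocity of O. In the proper frame of O (where the wave is observed) the four-velocity is $u^a = (c, \mathbf{0})$, so that $f^a u_a = -\nu c$, hence:

$$w^2 = \frac{c^2}{1 - \left(1 - \dfrac{c^2}{w^2}\right)} = w^2$$

as expected.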


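3. Let $f^{a'} = (f'_0, f'_x, f'_y, f'_z)$ be the frequency four-vector for the observer O'. The boost relating O, O' gives:

$$f'_x = \gamma_s\left(f_x - \frac{s}{c}\nu\right) = \gamma_s\frac{\nu c}{w}\left(1 - \frac{sw}{c^2}\right)$$
$$f'_y = f_y = 0$$
$$f'_z = f_z = 0$$
$$\nu' = \gamma_s\left(\nu - \frac{s}{c}f_x\right) = \gamma_s\nu\left(1 - \frac{s}{w}\right).$$

The relation $\nu' = \gamma_s\nu\left(1 - \frac{s}{w}\right)$ is the relativistic Doppler shift (to be discussed). The difference from the Newtonian Doppler shift is the appearance of the $\gamma$-factor which accounts for the time dilation effect.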

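The phase velocity w' of the wave wrt O' is computed from relation (18.23) written in O':

$$w'^2 = \frac{c^2}{1 + \dfrac{f^a f_a}{\nu'^2}}.$$

Replacing $f^a f_a = -\nu^2\left(1 - \frac{c^2}{w^2}\right)$ and $\nu' = \gamma_s\nu\left(1 - \frac{s}{w}\right)$ we find³:

$$w'^2 = \frac{c^2}{1 - \dfrac{1 - c^2/w^2}{\gamma_s^2\left(1 - s/w\right)^2}}.$$

A quicker method to determine the phase speed w' is to use the invariance of $f^a f_a$ and write:

$$\nu'^2\left(1 - \frac{c^2}{w'^2}\right) = \nu^2\left(1 - \frac{c^2}{w^2}\right).$$

Replacing the various quantities we find the same result for the phase velocity w'.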
The relation $\nu' = \gamma\nu\left(1 - \frac{u}{w}\right)$, where u is the speed of O' wrt O, holds for $0 < w < +\infty$. We note that for $w = u < c$ we have $\nu' = 0$. In this case the RIO O' moves with the wavefront and we say that the observer is 'frozen' in the wave. We also note that for $u > w$ we get $\nu' < 0$, which does not make sense, so the relation does not hold. This case is met e.g. in sound waves when the observer moves with supersonic speed.
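The boost of Example 18.8.1 can be reproduced numerically. The sketch below works in units with $c = 1$ and uses illustrative values for $\nu$, $w$ and $s$ that are not taken from the text; it assumes the components $f^a = (\nu, \frac{c}{w}\nu, 0, 0)$ found in part 1 and reads the phase speed in O' off the relation $f'_x = \frac{c}{w'}\nu'$.

```python
import math

def boost_x(f, s, c=1.0):
    """Standard boost with speed s along x applied to f = (f0, fx, fy, fz)."""
    g = 1.0 / math.sqrt(1.0 - (s / c) ** 2)
    f0, fx, fy, fz = f
    return (g * (f0 - s / c * fx), g * (fx - s / c * f0), fy, fz)

def norm(f):
    """Lorentz length f^a f_a with signature (-, +, +, +)."""
    return -f[0] ** 2 + f[1] ** 2 + f[2] ** 2 + f[3] ** 2

# Illustrative values (assumptions): units with c = 1 and a wave with w > c.
c, nu, w, s = 1.0, 5.0, 2.0, 0.3
gamma = 1.0 / math.sqrt(1.0 - (s / c) ** 2)

f = (nu, c / w * nu, 0.0, 0.0)   # frequency four-vector in O
fp = boost_x(f, s, c)            # the same four-vector in O'
nu_p = fp[0]                     # frequency measured by O'
w_p = c * fp[0] / fp[1]          # phase speed in O'

print(nu_p, gamma * nu * (1.0 - s / w))   # the two numbers should agree
print(norm(f), norm(fp))                  # f^a f_a is invariant
print(w_p)
```

Comparing `norm(f)` with `norm(fp)` is the numerical counterpart of the "quicker method" based on the invariance of $f^a f_a$.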


18.8.2 The Doppler Shift 


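The Doppler shift concerns the relation of the frequencies of a given wave as seen by two different RIOs. To derive this formula we note that the four-velocity of a RIO $\Sigma$ in its proper frame is:

$$u^a = \begin{pmatrix} c \\ \mathbf{0} \end{pmatrix}_{\Sigma}$$

therefore the frequency $\nu$ is given by the invariant (see also (18.92)):

$$f^a u_a = -\nu c. \qquad (18.26)$$

Consider two observers $\Sigma_1$, $\Sigma_2$ whose four-velocities in $\Sigma$ are as follows:

$$v_1^a = \begin{pmatrix} \gamma_1 c \\ \gamma_1\mathbf{v}_1 \end{pmatrix}_{\Sigma}, \qquad v_2^a = \begin{pmatrix} \gamma_2 c \\ \gamma_2\mathbf{v}_2 \end{pmatrix}_{\Sigma} \qquad (18.27)$$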


³ This result is different from the composition of 3-velocities we met before, that is, the phase velocity does not behave as the 3-velocity of a particle. This is due to the fact that the phase velocity is a 'geometric velocity' concerning surfaces in spacetime and not particles or other physical systems. For this reason the phase speed a. is not restricted by the speed of light in vacuum and b. does not obey the relativistic rule of composition of 3-velocities.




and assume that these observers measure for the wave with frequency four-vector $f^a$ the frequencies $\nu_1$, $\nu_2$ respectively. Obviously $\nu_1 \neq \nu_2$ because the frequency is an observer dependent quantity. The Doppler formula in covariant form is simply the quotient:

$$\frac{\nu_1}{\nu_2} = \frac{f_a v_1^a}{f_a v_2^a}. \qquad (18.28)$$


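We emphasize that this formula holds for all types of waves: electromagnetic waves, sound waves etc. In order to write it in a form similar to the standard one we compute the inner products using (18.18) and (18.27) and find⁴:

$$\frac{\nu_1}{\nu_2} = \frac{\gamma_1\left(-1 + \frac{1}{w}\mathbf{v}_1\cdot\hat{\mathbf{k}}\right)}{\gamma_2\left(-1 + \frac{1}{w}\mathbf{v}_2\cdot\hat{\mathbf{k}}\right)} = \frac{\gamma_1}{\gamma_2}\,\frac{1 - \frac{1}{w}(\mathbf{v}_1\cdot\hat{\mathbf{k}})}{1 - \frac{1}{w}(\mathbf{v}_2\cdot\hat{\mathbf{k}})} \qquad (18.29)$$

where $\nu$ is the frequency and w is the phase speed of the wave in $\Sigma$. In the case of an electromagnetic wave $w = c$ and this formula becomes:

$$\frac{\nu_1}{\nu_2} = \frac{\gamma_1}{\gamma_2}\,\frac{1 - \frac{1}{c}(\mathbf{v}_1\cdot\hat{\mathbf{k}})}{1 - \frac{1}{c}(\mathbf{v}_2\cdot\hat{\mathbf{k}})} \qquad (18.30)$$

which is the familiar formula.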

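We assume now that $\Sigma_2$ coincides with $\Sigma$, so that the frequency of the wave in $\Sigma$ is $\nu_2 = \nu$ and $\mathbf{v}_2 = \mathbf{0}$. In this case (18.29) gives that the frequency of the wave in $\Sigma_1$ is given by the formula:

$$\nu_1 = \gamma_1\left(1 - \frac{1}{w}(\mathbf{v}_1\cdot\hat{\mathbf{k}})\right)\nu = \gamma_1\left(1 - \frac{v_1}{w}\cos\theta\right)\nu \qquad (18.31)$$

where $\theta$ is the angle in $\Sigma$ between the direction of motion of $\Sigma_1$ in $\Sigma$ and the direction of propagation of the wave in $\Sigma$.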
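The covariant recipe "frequency = invariant $-\frac{1}{c}f_a u^a$" lends itself to a direct numerical check. The sketch below assumes units with $c = 1$, the components $f^a = (\nu, \frac{c}{w}\nu\hat{\mathbf{k}})$ and illustrative numbers; the helper names are hypothetical.

```python
import math

def minkowski(a, b):
    """Lorentz product with signature (-, +, +, +)."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))

def four_velocity(v, c=1.0):
    """Four-velocity (gamma c, gamma v) of an observer with 3-velocity v in Sigma."""
    g = 1.0 / math.sqrt(1.0 - sum(x * x for x in v) / c ** 2)
    return [g * c] + [g * x for x in v]

def measured_frequency(f, v, c=1.0):
    """nu = -(1/c) f_a u^a for the observer with 3-velocity v in Sigma."""
    return -minkowski(f, four_velocity(v, c)) / c

# Illustrative numbers (assumptions): c = 1, wave along x with w = 0.5 in Sigma.
c, nu, w = 1.0, 3.0, 0.5
f = [nu, c / w * nu, 0.0, 0.0]

speed, theta = 0.3, math.radians(60.0)
v1 = (speed * math.cos(theta), speed * math.sin(theta), 0.0)
gamma1 = 1.0 / math.sqrt(1.0 - speed ** 2)

nu1_covariant = measured_frequency(f, v1, c)
nu1_formula = gamma1 * (1.0 - speed / w * math.cos(theta)) * nu

print(nu1_covariant, nu1_formula)   # the two values should agree
```

An observer at rest in $\Sigma$ (`v = (0, 0, 0)`) recovers $\nu$ itself, which is the content of (18.26).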


Exercise 18.8.2 Derive (18.31) using the Lorentz transformation relating X, X). 


We consider two extreme cases of (18.31). 


⁴ Unfortunately the role of the Doppler effect frequently is not properly understood and people approach it in a different way for electromagnetic and non-electromagnetic waves. See for example R. Bachman, 'Relativistic acoustic Doppler effect' (1982) Am. J. Phys. 50, 816 and R. Bachman, 'Relativistic acoustic Doppler effect in the optical limit' (1986) Am. J. Phys. 54, 848.




18.8.2.1 The Radial Doppler Shift 


This is the case $\theta = 0, \pi$ for which (18.31) reduces to:

$$\nu_1 = \gamma_1\left(1 \mp \frac{v_1}{w}\right)\nu \qquad (18.32)$$

where the upper sign corresponds to the value $\theta = 0$ and the lower sign to the value $\theta = \pi$. In the case of an electromagnetic wave $w = c$ and formula (18.32) becomes

$$\nu_1 = \sqrt{\frac{1 \mp \beta}{1 \pm \beta}}\;\nu \qquad (18.33)$$

which coincides with the standard well known Doppler formula for light waves. It follows that when $\Sigma_1$ (the observer) 'approaches' $\Sigma$ (the source) (case $\theta = \pi$) the frequency $\nu_1 > \nu$ whereas in the opposite case (case $\theta = 0$) the frequency $\nu_1 < \nu$. In the first case we say that the wave frequency is blue shifted and in the second case that it is red shifted. We cannot reach the same conclusion for a general wave because the ratio $w/c$ can be larger than 1, therefore depending on the phase speed and the speed of the observer it is possible that $\nu_1 > \nu$ or $\nu_1 < \nu$.


Example 18.8.2 In order to assess the validity of the relativistic radial Doppler formula over the Newtonian one the following experiment has been suggested. A source emits in its proper frame $\Sigma^+$ monochromatic plane waves of frequency $\nu_0$. In another frame $\Sigma_1$ (resp. $\Sigma_2$) which moves wrt $\Sigma^+$ in the direction $\hat{\mathbf{k}}$ (resp. $-\hat{\mathbf{k}}$) normal to the wavefronts with velocity $\mathbf{u}_1 = u\hat{\mathbf{k}}$ (resp. $\mathbf{u}_2 = -u\hat{\mathbf{k}}$) the frequency of the (plane) waves is $\nu_1$ (resp. $\nu_2$). Is $\nu_1 = \nu_2$? What happens in the Newtonian case? Apply the result to sound waves whose phase speed in air is 317 m/s. What would the conclusion be for light waves?

Solution 

The Doppler shift formula gives for each observer:

$$\frac{\nu_1}{\nu_0} = \gamma\left(1 + \frac{u}{w}\right), \qquad \frac{\nu_2}{\nu_0} = \frac{1}{\gamma}\,\frac{1}{1 - \frac{u}{w}}.$$

Therefore:

$$\nu_2 - \nu_1 = \nu_0\left[\frac{1}{\gamma}\,\frac{1}{1 - \frac{u}{w}} - \gamma\left(1 + \frac{u}{w}\right)\right] \qquad (18.34)$$

that is $\nu_1 \neq \nu_2$! The result is surprising because it appears not to be symmetric wrt the change of the direction of the velocity, thus violating the basic principle of Special Relativity concerning the relativity of motion. However this is not the case because the frequency is a component of a physical quantity (the frequency four-vector), therefore it is not itself a physical quantity in Special Relativity (the laws and the assumptions of a physical theory concern only the physical quantities of that theory!).

Let us assume that $\frac{u}{c} \ll 1$, $\frac{u}{w} \ll 1$ and expand $\dfrac{1}{1 - \frac{u}{w}}$ in Taylor series. Then (18.34) becomes:

$$\nu_2 - \nu_1 = \nu_0\left(\frac{u^2}{w^2} - \frac{u^2}{c^2} + O\left(\left(\frac{u}{w}\right)^3\right)\right)$$

which shows that in the second order approximation it is possible to measure the difference $\nu_2 - \nu_1$ resulting from the Doppler shift for all types of waves except for electromagnetic waves, for which the difference is:

$$\nu_2 - \nu_1 = \nu_0\, O\left(\left(\frac{u}{c}\right)^3\right)$$

which means that one has to consider third order terms. In Newtonian Physics we have:

$$\frac{\nu_1}{\nu_0} = 1 + \frac{u}{w}, \qquad \frac{\nu_2}{\nu_0} = \frac{1}{1 - \frac{u}{w}}$$

hence the difference:

$$\nu_2 - \nu_1 = \nu_0\left(1 + \frac{u}{w} + \frac{u^2}{w^2} - 1 - \frac{u}{w} + O\left(\left(\frac{u}{w}\right)^3\right)\right) = \nu_0\left(\frac{u^2}{w^2} + O\left(\left(\frac{u}{w}\right)^3\right)\right)$$

that is, we find a difference in the second order terms with the relativistic Doppler shift. This difference holds for all waves provided that one neglects terms of the third order $O\left(\left(\frac{u}{w}\right)^3, \left(\frac{u}{c}\right)^3\right)$. The above imply that it is possible (in principle) to determine with second order approximation the validity of the relativistic Doppler formula over the Newtonian one for all types of waves but the electromagnetic ones, for which the approximation is of the third order.

As a particular realistic application we consider the sound waves for which w = 317 m/s and calculate the entries of the following table for various speeds u of $\Sigma_1$, $\Sigma_2$:

u (m/s)       |  3          |  4            |  5            |  7
(u/w)^2       |  9 × 10⁻⁵   |  1.6 × 10⁻⁴   |  2.5 × 10⁻⁴   |  4.9 × 10⁻⁴
[illegible]   |  ~10⁻²⁰     |  ~10⁻¹⁹       |  ~10⁻¹⁹       |  ~10⁻¹⁹
[illegible]   |  ~10⁻⁹      |  ~10⁻⁸        |  ~10⁻⁸        |  ~10⁻⁸


We note that $\left(\frac{u}{c}\right)^2 \ll \left(\frac{u}{w}\right)^2$, that is the term $\left(\frac{u}{w}\right)^2$ predominates, therefore it is not possible to use sound waves to design an experiment for determining the validity of the relativistic Doppler shift over the Newtonian one for standard speeds in the laboratory. This result emphasizes once more that for usual speeds in the laboratory the numerical results of the relativistic and the Newtonian experiments up to the second order of approximation practically coincide.
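The orders of magnitude involved can be recomputed with a few lines of Python. This is only a sketch: it uses the phase speed 317 m/s quoted in the text and the SI value of c, and the variable names are illustrative.

```python
C = 299_792_458.0   # speed of light in vacuum (m/s)
W_SOUND = 317.0     # phase speed of sound used in the text (m/s)

def second_order_terms(u, w=W_SOUND, c=C):
    """Return ((u/w)^2, (u/c)^2) for an observer speed u in m/s."""
    return (u / w) ** 2, (u / c) ** 2

rows = {u: second_order_terms(u) for u in (3.0, 4.0, 5.0, 7.0)}

for u, (newtonian, relativistic) in rows.items():
    print(f"u = {u:3.0f} m/s   (u/w)^2 = {newtonian:.2e}   (u/c)^2 = {relativistic:.2e}")
```

For these speeds the term $(u/w)^2$ exceeds $(u/c)^2$ by roughly twelve orders of magnitude, which is the quantitative content of the statement that the Newtonian term predominates.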

For electromagnetic waves the equality $\nu_1 = \nu_2$ is exact. This is due to the fact that for light waves the group and the phase velocity coincide, therefore the relativity of motion applies. The proof of the relation $\nu_1 = \nu_2$ for electromagnetic waves is simple. Indeed in an obvious notation one has:

$$\nu_1 = \frac{\nu_0}{\gamma_u}\,\frac{1}{1 - \frac{u}{c}} = \nu_0\sqrt{\frac{1 + \frac{u}{c}}{1 - \frac{u}{c}}}$$

$$\nu_2 = \nu_0\gamma_u\left(1 + \frac{u}{c}\right) = \nu_0\sqrt{\frac{1 + \frac{u}{c}}{1 - \frac{u}{c}}} = \nu_1.$$


Example 18.8.3 A source emits plane waves of frequency $\nu_0$ and phase velocity $\mathbf{w} = w\hat{\mathbf{x}}$. The source moves along the x-axis with constant velocity $\mathbf{u} = u\hat{\mathbf{x}}$ ($0 < u < w$). Along the x-axis there is a plane mirror normal to the x-axis on which the waves are reflected elastically. Calculate the echo frequency (that is, the frequency of the reflected waves in the rest frame of the source). Consider the special case of light waves.

Solution 

Let O be the observer at the proper frame of the source. The frequency four-vector of the incident wave for O is $f^a_O = \left(\nu_0, \frac{c}{w}\nu_0, 0, 0\right)$. Let O' be the proper observer of the mirror. The boost relating O, O' gives for the frequency four-vector of the incident wave:

$$f^a_{O'} = \left(\gamma_u\left(\nu_0 - \frac{u}{c}\frac{c}{w}\nu_0\right), \gamma_u\left(\frac{c}{w}\nu_0 - \frac{u}{c}\nu_0\right), 0, 0\right) = \gamma_u\nu_0\left(1 - \frac{u}{w}, \frac{c}{w} - \frac{u}{c}, 0, 0\right).$$

The wave is reflected elastically, therefore the spatial part changes sign and the zeroth component remains the same. This means that for the observer O' the frequency four-vector of the reflected wave is:

$$f^a_{refl,O'} = \gamma_u\nu_0\left(1 - \frac{u}{w}, -\frac{c}{w} + \frac{u}{c}, 0, 0\right).$$

Using the boost relating O', O we find the frequency four-vector in the proper frame of O:

$$f^a_{refl,O} = \gamma_u^2\nu_0\left(1 - \frac{u}{w} - \frac{u}{c}\left(\frac{c}{w} - \frac{u}{c}\right), -\frac{c}{w} + \frac{u}{c} + \frac{u}{c}\left(1 - \frac{u}{w}\right), 0, 0\right).$$

The frequency of the echo is the zeroth component of the frequency four-vector. Therefore:

$$\nu_{refl,O} = \left(1 - 2\frac{u}{w} + \frac{u^2}{c^2}\right)\gamma_u^2\nu_0. \qquad (18.35)$$

The phase velocity of the reflected wave is computed from the x-component:

$$\frac{\nu_{refl,O}\,c}{w_{refl,O}} = \gamma_u^2\nu_0\left(-\frac{c}{w} + 2\frac{u}{c} - \frac{u^2}{cw}\right) \;\Rightarrow\; w_{refl,O} = c\,\frac{1 - 2\frac{u}{w} + \frac{u^2}{c^2}}{-\frac{c}{w} + 2\frac{u}{c} - \frac{u^2}{cw}}.$$

We note again that the phase velocity $w_{refl,O}$ is not computed from the phase velocity w with the relativistic rule of composition of 3-velocities. This must not worry us (see also the footnote of Example 18.8.1) because the phase velocity is a geometric velocity, not a particle velocity.

In the case of an electromagnetic wave we have $w = c$ and from the above relations we find⁵:

$$\nu_{refl,O} = \frac{1 - \frac{u}{c}}{1 + \frac{u}{c}}\,\nu_0, \qquad (18.36)$$

$$|w_{refl,O}| = c$$

which implies that in the case of electromagnetic waves the reflected frequency $\nu_{refl,O} < \nu_0$ whereas the phase speed is c as expected. In this case the reflected wave is red shifted. When the mirror is moving towards the emitter $\nu_{refl,O} > \nu_0$ and the reflected wave is blue shifted wrt O.

⁵ When the mirror is moving towards the observer O the velocity $u \to -u$ and the relations we derived become $\nu_{refl,O} = \dfrac{1 + \frac{u}{c}}{1 - \frac{u}{c}}\,\nu_0$ while again $|w_{refl,O}| = c$, that is the phase speed remains the same. This holds only for photons, whose kinematic and particle nature are completely equivalent or, equivalently, the phase velocity of the de Broglie wave associated with a photon (to be discussed below) and the 'particle' velocity (speed c) are equal. This is not true for other types of waves and certainly not for the de Broglie wave associated with a particle with mass.

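For other types of waves we do not have a unique answer. Indeed from (18.35) we have:

$$\nu_{refl,O} = \left(\frac{1}{\gamma_u^2} - 2\frac{u}{w} + 2\frac{u^2}{c^2}\right)\gamma_u^2\nu_0 = \nu_0 - 2\frac{u}{w}\left(1 - \frac{uw}{c^2}\right)\gamma_u^2\nu_0$$

so that $\nu_{refl,O} < \nu_0$, $= \nu_0$ or $> \nu_0$ according to whether $uw < c^2$, $uw = c^2$ or $uw > c^2$, therefore all options are possible.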
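The three steps of the solution (boost to the mirror frame, sign flip of the spatial part, boost back) can be sketched numerically as follows. Units with $c = 1$ and the numerical values are assumptions made for illustration, and the function names are hypothetical.

```python
import math

def boost(f0, fx, u, c=1.0):
    """Boost the (f0, fx) part of a four-vector to a frame moving with speed u along x."""
    g = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return g * (f0 - u / c * fx), g * (fx - u / c * f0)

def echo_frequency(nu0, w, u, c=1.0):
    """Frequency of the reflected wave in the source frame O."""
    f0, fx = boost(nu0, c / w * nu0, u, c)   # incident wave seen in the mirror frame O'
    fx = -fx                                 # elastic reflection: spatial part changes sign
    f0, fx = boost(f0, fx, -u, c)            # back to the source frame O
    return f0

def echo_formula(nu0, w, u, c=1.0):
    """Closed form (18.35)."""
    g2 = 1.0 / (1.0 - (u / c) ** 2)
    return (1.0 - 2.0 * u / w + (u / c) ** 2) * g2 * nu0

print(echo_frequency(1.0, 0.5, 0.2), echo_formula(1.0, 0.5, 0.2))   # slow wave, w < c
print(echo_frequency(1.0, 1.0, 0.2), (1.0 - 0.2) / (1.0 + 0.2))     # light, compare (18.36)
print(echo_frequency(1.0, 10.0, 0.5))                               # case u w > c^2
```

The last call illustrates that for a wave with $uw > c^2$ the echo frequency can exceed $\nu_0$ even though the mirror recedes.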

Second solution for the electromagnetic waves only. 

It is useful to give a second solution in the case of the electromagnetic waves 
using only the constancy of the speed of light. This will show the great advantage 
and the clarity of the considerations when one uses the frequency four-vector. 

We work only in the proper frame O of the source. The source emits light waves of frequency $\nu_0 = \frac{1}{T}$ where T is the period of the wave. Let us assume that at the time moment $t_0$ (of O), while the mirror is at a distance A from the source, the source emits a light pulse. If the velocity of the mirror is $\mathbf{u} = u\hat{\mathbf{x}}$ then the pulse is reflected on the mirror after a time period:

$$d_1 = \frac{A}{c - u}.$$

As the pulse returns to the source it covers the same distance in a time period $d_2 = d_1$ due to the synchronization of clocks in Special Relativity. The time period between the emission and the reception of the pulse by O is

$$D = d_1 + d_2 = 2d_1$$

hence the moment the pulse reaches O is

$$t_1 = t_0 + D = t_0 + \frac{2A}{c - u}.$$

A second pulse is emitted from the source at the moment $t_0 + T$ and (obviously) returns to the source at the time moment (of O!):

$$t_2 = (t_0 + T) + \frac{2(A + uT)}{c - u}.$$

We compute the time difference $t_2 - t_1$ of the reception of the reflected pulses by O. We find

$$t_2 - t_1 = T + \frac{2uT}{c - u} = \frac{c + u}{c - u}\,T.$$

We note that this difference is independent of the time moment $t_0$ and the distance A of the mirror from the source. This difference equals the period T' of the reflected wave (echo) in O. Therefore the frequency of the echo for O is

$$\nu' = \frac{1}{T'} = \frac{c - u}{c + u}\,\nu_0.$$

Working in a single RIO we have computed a relativistic result! Where is the relativity? The answer is simple. The relativity is hidden in the synchronization of the clocks (chronometry) and in the fact that the speed of the reflected (light) wave is again c. It is the second factor which makes the solution possible only for the electromagnetic waves. For other types of waves, e.g. sound waves, we have to work with the frequency four-vector!


18.8.2.2. The Transverse Doppler Effect 


When $\theta = \pi/2$ the velocity of the observer is tangent to the wavefronts. In this case equation (18.31) gives:

$$\nu_1 = \gamma_1\nu \qquad (18.37)$$

that is, the frequency is blue shifted ($\nu_1 > \nu$). In Newtonian Physics $\gamma_1 = 1$ hence there is no Doppler effect. Therefore the transverse Doppler effect is a purely relativistic phenomenon and its observation will confirm or refute the validity of Special Relativity. It is important to note that the transverse Doppler effect is independent of the phase speed of the wave, hence one can use any type of wave to study its validity.

Whereas the measurement of the radial Doppler effect is relatively easy, the transverse Doppler effect requires higher accuracy. To show this let us consider the radial Doppler effect of light waves for blueshift, that is (see (18.33))⁶:

$$\nu_{1,b} = \sqrt{\frac{1+\beta}{1-\beta}}\,\nu = (1+\beta)\frac{1}{\sqrt{1-\beta^2}}\,\nu = (1+\beta)\left(1 + \frac{1}{2}\beta^2 + O(\beta^4)\right)\nu = (1+\beta)\nu + O(\beta^2)\nu$$

from which follows:

$$\frac{\nu_{1,b} - \nu}{\nu} = \beta + O(\beta^2). \qquad (18.38)$$

⁶ $(1 - x)^{-1/2} = 1 + \frac{1}{2}x + \cdots$ where $|x| < 1$.

Similarly for the red shifted radial Doppler effect we compute:

$$\frac{\nu_{1,r} - \nu}{\nu} = -\beta + O(\beta^2). \qquad (18.39)$$

We conclude that the radial Doppler effect is of first order in the parameter $\beta$. We also note that:

$$\frac{\nu_{1,b} - \nu_{1,r}}{\nu} = 2\beta + O(\beta^2) \qquad (18.40)$$

that is, the difference in the frequencies for 'coming' and 'going' is a first order effect in $\beta$. The first experiment to measure this difference and confirm the relativistic Doppler effect was done in 1938 by Ives and Stilwell.⁷

For the transverse Doppler effect we have from (18.37) by Taylor expansion:

$$\nu_1 = \left(1 + \frac{1}{2}\beta^2 + O(\beta^4)\right)\nu$$

from which follows:

$$\frac{\nu_1 - \nu}{\nu} = \frac{1}{2}\beta^2 + O(\beta^4) \qquad (18.41)$$

that is, the transverse Doppler effect⁸ is of second order in the parameter $\beta$.


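Example 18.8.4 Find all directions of the phase velocity of a wave wrt the relative velocity of $\Sigma_1$, $\Sigma_2$ so that the frequency of the wave is equal in both frames, that is, $\nu_1 = \nu_2$. Discuss the case of light waves.

Solution

The Doppler shift formula (18.31) for $\Sigma_1$ gives:

$$\nu_1 = \gamma\left(1 - \frac{u}{w_1}\cos\theta_1\right)\nu \;\Rightarrow\; \cos\theta_1 = \frac{w_1}{u}\left(1 - \frac{\nu_1}{\gamma\nu}\right).$$

Similarly for $\Sigma_2$ (18.31) gives:

$$\nu_2 = \gamma\left(1 + \frac{u}{w_2}\cos\theta_2\right)\nu \;\Rightarrow\; \cos\theta_2 = -\frac{w_2}{u}\left(1 - \frac{\nu_1}{\gamma\nu}\right)$$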


⁷ H. Ives and G.R. Stilwell, "An experimental study of the rate of a moving clock", J. Opt. Soc. Am. 28, 215-226 (1938) and part II, J. Opt. Soc. Am. 31, 369-374 (1941).

⁸ To date, only one inertial experiment appears to have verified the redshift effect for a detector actually aimed at 90° to the object. See D. Hasselkamp, E. Mondry and A. Scharmann (1979), 'Direct Observation of the Transversal Doppler-Shift', Z. Physik A 289, 151-155. For a more recent account see Walter Kündig (1963), 'Measurement of the Transverse Doppler Effect in an Accelerated System', Phys. Rev. 129, 2371-2375.


where the minus sign indicates that the velocities of $\Sigma_1$, $\Sigma_2$ are 'above' and 'below' the direction of motion of the source and in the rhs we have replaced $\nu_2 = \nu_1$, the common frequency of the wave in $\Sigma_1$, $\Sigma_2$.

For light waves we have $w_1 = w_2 = c$ hence we find:

$$\cos\theta_1 = -\cos\theta_2 = \frac{1}{\beta}\left(1 - \frac{\nu_1}{\gamma\nu}\right). \qquad (18.42)$$

To reveal the kinematic meaning of this formula we consider the Euclidean 3-velocity space, in which we assume Cartesian coordinates with the positive x-axis directed along the relative velocity of $\Sigma_2$ wrt $\Sigma_1$. We have then $\beta\cos\theta_2 = \beta_x$, $\beta^2 = \beta_x^2 + \beta_y^2 + \beta_z^2$. Replacing in (18.42) and assuming $\nu_1 = \nu$ we find:

$$1 + \beta_x = \sqrt{1 - \beta^2} \;\Rightarrow$$
$$(1 + \beta_x)^2 + \beta_x^2 + \beta_y^2 + \beta_z^2 = 1 \;\Rightarrow$$
$$\left(\beta_x + \frac{1}{2}\right)^2 + \frac{1}{2}\left(\beta_y^2 + \beta_z^2\right) = \frac{1}{4} \;\Rightarrow$$
$$\frac{\left(\beta_x + \frac{1}{2}\right)^2}{\left(\frac{1}{2}\right)^2} + \frac{\beta_y^2 + \beta_z^2}{\left(\frac{\sqrt{2}}{2}\right)^2} = 1.$$

We conclude that in the 3-velocity Euclidean space the locus of the velocity points corresponding to the condition that the observed light frequency is the same in both frames (this condition is also called the zero redshift condition, where the redshift z is defined by the equation $1 + z = \nu_2/\nu_1$) is an oblate ellipsoid centered at the point $\left(-\frac{1}{2}, 0, 0\right)$. A cross section of this ellipsoid with the z-plane is shown in Fig. 18.6. The ellipsoid divides the velocity space in two regions. In the interior region we have $\nu_2 < \nu_1$ (blueshift) and in the outer region $\nu_2 > \nu_1$ (redshift).⁹
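The zero redshift locus can be probed numerically. The sketch below is written under an explicit assumption about the sign convention, namely that the zero-shift condition reads $\gamma(1 + \beta_x) = 1$, equivalently $1 + \beta_x = \sqrt{1 - \beta^2}$; with the opposite orientation of the x-axis the ellipsoid is mirrored about the origin. The parametrization and the function names are illustrative.

```python
import math

def on_ellipsoid(t, phi=0.0):
    """Point of the ellipsoid (bx + 1/2)^2/(1/2)^2 + (by^2 + bz^2)/(1/sqrt 2)^2 = 1."""
    bx = -0.5 + 0.5 * math.cos(t)
    rho = math.sin(t) / math.sqrt(2.0)
    return bx, rho * math.cos(phi), rho * math.sin(phi)

def frequency_ratio(bx, by, bz):
    """gamma * (1 + beta_x), the frequency ratio under the assumed sign convention."""
    b2 = bx * bx + by * by + bz * bz
    return (1.0 + bx) / math.sqrt(1.0 - b2)

# Points on the ellipsoid should give a ratio equal to 1 (zero shift).
ratios = [frequency_ratio(*on_ellipsoid(t, 0.7)) for t in (0.3, 1.0, 2.0, 2.5)]
print(ratios)

# Points off the ellipsoid give a ratio different from 1.
print(frequency_ratio(-0.5, 0.0, 0.0), frequency_ratio(0.3, 0.0, 0.0))
```

The first list checks several points of the ellipsoid; the last line compares a point in the interior (the center) with a point in the exterior.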


18.8.3 The Aberration of the Wave Vector 


As we have seen the Doppler effect has to do with the Lorentz transformation 
of the zero component of the frequency four-vector. The action of the Lorentz 
transformation on the spatial part of the frequency four-vector leads to another 
phenomenon which is called aberration. We have already studied the aberration 
of light in Chap. 9 where we were working with the four-momentum of the photon. 


⁹ The interested reader can find more information in K. Gordon (1980), 'The Doppler effect: A consideration of quasar redshifts', Am. J. Phys. 48, 514.

Fig. 18.6 The ellipse represents the locus of velocity points corresponding to zero redshift


Here we shall derive the same results working with the four-frequency vector in 
order to emphasize that the aberration is a general wave phenomenon which is not 
restricted to light waves only. 

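Consider two RIO $\Sigma_1$, $\Sigma_2$ for which the frequency four-vector of a wave has components:

$$f^a = \begin{pmatrix} \nu_1 \\ \frac{c}{2\pi}\mathbf{k}_1 \end{pmatrix}_{\Sigma_1} = \begin{pmatrix} \nu_2 \\ \frac{c}{2\pi}\mathbf{k}_2 \end{pmatrix}_{\Sigma_2}$$

and assume that $\Sigma_1$, $\Sigma_2$ are related by a Lorentz transformation with relative velocity $\mathbf{u} = c\boldsymbol{\beta}$. Then the general Lorentz transformation gives:

$$\nu_2 = \gamma\left(\nu_1 - \frac{c}{2\pi}\boldsymbol{\beta}\cdot\mathbf{k}_1\right) \qquad (18.43)$$

$$\mathbf{k}_2 = \mathbf{k}_1 + \frac{\gamma - 1}{\beta^2}(\boldsymbol{\beta}\cdot\mathbf{k}_1)\boldsymbol{\beta} - \gamma\frac{2\pi\nu_1}{c}\boldsymbol{\beta}. \qquad (18.44)$$

The first relation expresses the Doppler effect and the second the aberration of the direction of propagation of the wave, that is the different directions of propagation of the wave for the observers $\Sigma_1$, $\Sigma_2$. We write this relation in a more familiar form.

We write $\boldsymbol{\beta}\cdot\mathbf{k}_1 = \beta k_1\cos\theta_1$, where $\theta_1$ is the angle between the direction of propagation of the wave for $\Sigma_1$ and the relative velocity $\mathbf{u}$ of $\Sigma_1$, $\Sigma_2$. Then the first relation reads:

$$\nu_2 = \gamma\left(\nu_1 - \frac{c}{2\pi}\beta k_1\cos\theta_1\right).$$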




But $k = \frac{2\pi}{\lambda} = \frac{2\pi\nu}{w}$ where w is the phase speed of the wave. Replacing we find:

$$\nu_2 = \gamma\left(1 - \frac{u}{w_1}\cos\theta_1\right)\nu_1 \qquad (18.45)$$


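which is relation (18.31). Concerning the second relation, we multiply both sides with $\boldsymbol{\beta}$ and get:

$$\boldsymbol{\beta}\cdot\mathbf{k}_2 = \mathbf{k}_1\cdot\boldsymbol{\beta} + \frac{\gamma - 1}{\beta^2}(\boldsymbol{\beta}\cdot\mathbf{k}_1)\beta^2 - \gamma\frac{2\pi\nu_1}{c}\beta^2.$$

Next we write $\mathbf{k}_i = \frac{2\pi\nu_i}{w_i}\hat{\mathbf{k}}_i$, $i = 1, 2$ and this relation gives after some simple algebra and using (18.45):

$$\cos\theta_2 = \frac{\frac{c}{w_1}\cos\theta_1 - \beta}{\frac{c}{w_2}\left(1 - \frac{u}{w_1}\cos\theta_1\right)}. \qquad (18.46)$$

This is the aberration formula valid for all types of waves. Especially for the electromagnetic waves in vacuum $w_1 = w_2 = c$ and the general formula reduces to:

$$\cos\theta_2 = \frac{\cos\theta_1 - \beta}{1 - \beta\cos\theta_1} \qquad (18.47)$$

which is the well known formula for light aberration.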
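Doppler shift and aberration come out together when the whole frequency four-vector is boosted. The following sketch (units with $c = 1$, relative velocity along x, illustrative numbers and hypothetical function names) transforms $f^a = (\nu_1, \frac{c}{w_1}\nu_1\hat{\mathbf{k}}_1)$ and compares the resulting angle with (18.46) and (18.47).

```python
import math

def transform_wave(nu1, w1, theta1, beta, c=1.0):
    """Boost f^a = (nu1, (c/w1) nu1 k_hat) along x with speed beta * c.

    Returns (nu2, w2, cos_theta2) as seen in the second frame.
    """
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    f0 = nu1
    fx = c / w1 * nu1 * math.cos(theta1)
    fy = c / w1 * nu1 * math.sin(theta1)
    f0p = g * (f0 - beta * fx)
    fxp = g * (fx - beta * f0)
    fyp = fy                                  # transverse part is unchanged
    spatial = math.hypot(fxp, fyp)            # equals (c / w2) * nu2
    return f0p, c * f0p / spatial, fxp / spatial

# Illustrative values (assumptions): a slow wave with w1 = 0.7 c.
nu1, w1, theta1, beta = 2.0, 0.7, math.radians(50.0), 0.4
nu2, w2, cos2 = transform_wave(nu1, w1, theta1, beta)

u = beta  # since c = 1
cos2_general = ((1.0 / w1) * math.cos(theta1) - beta) / (
    (1.0 / w2) * (1.0 - u / w1 * math.cos(theta1)))
print(cos2, cos2_general)                     # general aberration formula

# Light in vacuum: the phase speed stays c and the light aberration formula holds.
_, w2_light, cos2_light = transform_wave(1.0, 1.0, theta1, beta)
print(w2_light, cos2_light, (math.cos(theta1) - beta) / (1.0 - beta * math.cos(theta1)))
```

Note that the general formula needs the phase speed $w_2$ in the second frame, which the sketch obtains from the boosted four-vector itself.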


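Example 18.8.5 Consider the RIO $\Sigma_1$, $\Sigma_2$ as above and assume that $\perp$, $\parallel$ refer to the direction of the relative velocity of $\Sigma_1$, $\Sigma_2$.

a. Show that the phase velocity of a wave is transformed as follows¹⁰:

$$\mathbf{w}_2 = \frac{w_1^2\,\gamma\left(1 - \frac{1}{w_1^2}\mathbf{u}\cdot\mathbf{w}_{1\parallel}\right)}{w_{1\perp}^2 + \gamma^2\left(\mathbf{w}_{1\parallel} - \frac{w_1^2}{c^2}\mathbf{u}\right)^2}\left[\mathbf{w}_{1\perp} + \gamma\left(\mathbf{w}_{1\parallel} - \frac{w_1^2}{c^2}\mathbf{u}\right)\right]. \qquad (18.48)$$

b. Show that in the case the wave vector $\mathbf{k}_{1\perp} = 0$ (i.e. $\mathbf{k}_1$ parallel to $\mathbf{u}$):

$$\mathbf{w}_2 = \frac{\mathbf{w}_1 - \mathbf{u}}{1 - \frac{\mathbf{u}\cdot\mathbf{w}_1}{c^2}} \qquad (18.49)$$

which coincides with the relativistic composition law for the 3-velocity $\mathbf{w}$. However one should note that this is only formal because $\mathbf{w}$ is not the 3-velocity of a particle.

c. Consider $w_1 = c$, that is an electromagnetic wave in $\Sigma_1$, and show that $w_2 = c$.

d. Show that the Galilean law of transformation of the phase velocity is:

$$\mathbf{w}_2 = \left(1 - \frac{1}{w_1^2}\mathbf{u}\cdot\mathbf{w}_{1\parallel}\right)\mathbf{w}_1. \qquad (18.50)$$

¹⁰ See R. Bachman, Am. J. Phys. (1989), 57, 628.

Solution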


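From (18.44) we have ($\parallel$ and $\perp$ denote the parts parallel and normal to $\mathbf{u}$):

$$\mathbf{k}_{2\parallel} = \gamma\left(\mathbf{k}_{1\parallel} - \frac{2\pi\nu_1}{c}\boldsymbol{\beta}\right) \qquad (18.51)$$

$$\mathbf{k}_{2\perp} = \mathbf{k}_{1\perp} \qquad (18.52)$$

from which follows:

$$k_2^2 = k_{1\perp}^2 + \gamma^2\left(\mathbf{k}_{1\parallel} - \frac{2\pi\nu_1}{c}\boldsymbol{\beta}\right)^2. \qquad (18.53)$$

From (18.45) follows:

$$\nu_2 = \gamma\nu_1\left(1 - \frac{1}{w_1^2}\mathbf{u}\cdot\mathbf{w}_{1\parallel}\right). \qquad (18.54)$$

The phase velocity is defined in (18.21) as follows:

$$\mathbf{w} = \frac{2\pi\nu}{k}\hat{\mathbf{k}} \;\Rightarrow\; \mathbf{w} = \frac{2\pi\nu}{k^2}\mathbf{k} \;\Rightarrow\; \mathbf{k} = \frac{2\pi\nu}{w^2}\mathbf{w}. \qquad (18.55)$$

Replacing in (18.53) we find:

$$k_2^2 = \left(\frac{2\pi\nu_1}{w_1^2}\right)^2 w_{1\perp}^2 + \gamma^2\left(\frac{2\pi\nu_1}{w_1^2}\mathbf{w}_{1\parallel} - \frac{2\pi\nu_1}{c}\boldsymbol{\beta}\right)^2 = \left(\frac{2\pi\nu_1}{w_1^2}\right)^2\left[w_{1\perp}^2 + \gamma^2\left(\mathbf{w}_{1\parallel} - \frac{w_1^2}{c^2}\mathbf{u}\right)^2\right]. \qquad (18.56)$$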
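Concerning the phase velocity in $\Sigma_2$ we have:

$$\mathbf{w}_2 = \frac{2\pi\nu_2}{k_2^2}\left(\mathbf{k}_{2\perp} + \mathbf{k}_{2\parallel}\right) = \frac{2\pi\nu_2}{k_2^2}\left[\mathbf{k}_{1\perp} + \gamma\left(\mathbf{k}_{1\parallel} - \frac{2\pi\nu_1}{c}\boldsymbol{\beta}\right)\right]$$

$$= \frac{2\pi\gamma\nu_1}{k_2^2}\,\frac{2\pi\nu_1}{w_1^2}\left(1 - \frac{1}{w_1^2}\mathbf{u}\cdot\mathbf{w}_{1\parallel}\right)\left[\mathbf{w}_{1\perp} + \gamma\left(\mathbf{w}_{1\parallel} - \frac{w_1^2}{c^2}\mathbf{u}\right)\right]$$

$$= \frac{w_1^2\,\gamma}{w_{1\perp}^2 + \gamma^2\left(\mathbf{w}_{1\parallel} - \frac{w_1^2}{c^2}\mathbf{u}\right)^2}\left(1 - \frac{1}{w_1^2}\mathbf{u}\cdot\mathbf{w}_{1\parallel}\right)\left[\mathbf{w}_{1\perp} + \gamma\left(\mathbf{w}_{1\parallel} - \frac{w_1^2}{c^2}\mathbf{u}\right)\right]. \qquad (18.57)$$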


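b. When $\mathbf{k}_{1\perp} = 0$ then $\mathbf{w}_{1\perp} = 0$, hence $\mathbf{w}_1 = \mathbf{w}_{1\parallel}$. Then (18.57) gives:

$$\mathbf{w}_2 = \frac{w_1^2\left(1 - \frac{1}{w_1^2}\mathbf{u}\cdot\mathbf{w}_1\right)}{\left(\mathbf{w}_1 - \frac{w_1^2}{c^2}\mathbf{u}\right)^2}\left(\mathbf{w}_1 - \frac{w_1^2}{c^2}\mathbf{u}\right).$$

We have $\mathbf{w}_1 = w_1\hat{\mathbf{u}}$ and $\mathbf{u} = u\hat{\mathbf{u}}$, hence:

$$\mathbf{w}_2 = \frac{w_1 - u}{1 - \frac{uw_1}{c^2}}\,\hat{\mathbf{u}}$$

which is the relativistic composition formula for $\mathbf{k}_1 \parallel \mathbf{k}_2 \parallel \mathbf{u}$. This result must not lead us to consider that the phase velocity can be handled as the standard particle velocity. To see this let us consider the case $\mathbf{w}_{1\parallel} = 0$, that is the transverse Doppler effect. Then equation (18.57) gives:

$$\mathbf{w}_2 = \frac{w_1^2\,\gamma}{w_1^2 + \gamma^2\frac{w_1^4}{c^4}u^2}\left(\mathbf{w}_1 - \gamma\frac{w_1^2}{c^2}\mathbf{u}\right) = \frac{\gamma}{1 + \gamma^2\beta^2\frac{w_1^2}{c^2}}\left(\mathbf{w}_1 - \gamma\frac{w_1^2}{c^2}\mathbf{u}\right)$$

from which follows:

$$\mathbf{w}_{2\parallel} = -\frac{\gamma^2\frac{w_1^2}{c^2}}{1 + \gamma^2\beta^2\frac{w_1^2}{c^2}}\,\mathbf{u}. \qquad (18.58)$$

For a light wave this result reduces to:

$$\mathbf{w}_{2\parallel} = -\frac{\gamma^2}{1 + \gamma^2\beta^2}\,\mathbf{u} = -\mathbf{u}$$

which is compatible with the 3-velocity relativistic composition rule (see also question c.)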
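c. When $w_1 = c$ (18.57) gives:

$$\mathbf{w}_2 = \frac{c^2\gamma\left(1 - \frac{\mathbf{u}\cdot\mathbf{w}_{1\parallel}}{c^2}\right)}{w_{1\perp}^2 + \gamma^2\left(\mathbf{w}_{1\parallel} - \mathbf{u}\right)^2}\left[\mathbf{w}_{1\perp} + \gamma\left(\mathbf{w}_{1\parallel} - \mathbf{u}\right)\right].$$

The denominator is written as follows:

$$w_{1\perp}^2 + \gamma^2\left(\mathbf{w}_{1\parallel} - \mathbf{u}\right)^2 = c^2 - w_{1\parallel}^2 + \gamma^2\left(w_{1\parallel}^2 + u^2 - 2\mathbf{w}_{1\parallel}\cdot\mathbf{u}\right)$$
$$= c^2 + \gamma^2u^2 + (\gamma^2 - 1)w_{1\parallel}^2 - 2\gamma^2\,\mathbf{w}_{1\parallel}\cdot\mathbf{u}$$
$$= c^2\gamma^2 + \gamma^2\beta^2w_{1\parallel}^2 - 2\gamma^2\,\mathbf{w}_{1\parallel}\cdot\mathbf{u}$$
$$= c^2\gamma^2\left(1 - \frac{\mathbf{u}\cdot\mathbf{w}_{1\parallel}}{c^2}\right)^2$$

therefore:

$$\mathbf{w}_2 = \frac{1}{\gamma\left(1 - \frac{\mathbf{u}\cdot\mathbf{w}_{1\parallel}}{c^2}\right)}\left[\mathbf{w}_{1\perp} + \gamma\left(\mathbf{w}_{1\parallel} - \mathbf{u}\right)\right].$$

The length:

$$w_2^2 = \frac{1}{\gamma^2\left(1 - \frac{\mathbf{u}\cdot\mathbf{w}_{1\parallel}}{c^2}\right)^2}\left(w_{1\perp}^2 + \gamma^2\left(\mathbf{w}_{1\parallel} - \mathbf{u}\right)^2\right) = \frac{c^2\gamma^2\left(1 - \frac{\mathbf{u}\cdot\mathbf{w}_{1\parallel}}{c^2}\right)^2}{\gamma^2\left(1 - \frac{\mathbf{u}\cdot\mathbf{w}_{1\parallel}}{c^2}\right)^2} = c^2.$$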

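d. To obtain the Galilean law of transformation of the phase velocity we set $\beta = 0$ and $\gamma = 1$ in (18.57) and find:

$$\mathbf{w}_2 = \frac{w_1^2}{w_{1\perp}^2 + w_{1\parallel}^2}\left(1 - \frac{1}{w_1^2}\mathbf{u}\cdot\mathbf{w}_{1\parallel}\right)\left[\mathbf{w}_{1\perp} + \mathbf{w}_{1\parallel}\right] = \left(1 - \frac{1}{w_1^2}\mathbf{u}\cdot\mathbf{w}_{1\parallel}\right)\mathbf{w}_1.$$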
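The transformation law (18.57) is easy to evaluate numerically once the phase velocity is split into its parts parallel and normal to $\mathbf{u}$. The sketch below takes both parts as signed scalars in the plane spanned by $\mathbf{u}$ and $\mathbf{w}_1$, works in units with $c = 1$, and uses a hypothetical function name; it is meant as a check of parts b and c, not as a general-purpose routine.

```python
import math

def transform_phase_velocity(w_par, w_perp, u, c=1.0):
    """Evaluate (18.57): returns (w2_par, w2_perp) in the frame moving with speed u.

    w_par and w_perp are the components of w1 parallel and normal to u.
    """
    g = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    w1_sq = w_par ** 2 + w_perp ** 2
    shifted = w_par - w1_sq * u / c ** 2                 # w1_par - (w1^2 / c^2) u
    factor = g * w1_sq * (1.0 - u * w_par / w1_sq) / (w_perp ** 2 + g ** 2 * shifted ** 2)
    return factor * g * shifted, factor * w_perp

# Part c: a light wave (|w1| = c) keeps phase speed c for any direction.
w2_par, w2_perp = transform_phase_velocity(math.cos(1.0), math.sin(1.0), 0.5)
print(math.hypot(w2_par, w2_perp))

# Part b: for w1 parallel to u the result matches (w1 - u) / (1 - u w1 / c^2).
w2_par_b, w2_perp_b = transform_phase_velocity(0.3, 0.0, 0.5)
print(w2_par_b, (0.3 - 0.5) / (1.0 - 0.5 * 0.3))

# Transverse light wave: the parallel component in the second frame is -u.
print(transform_phase_velocity(0.0, 1.0, 0.5)[0])
```

Each printed pair corresponds to one of the special cases worked out in the solution.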


Exercise 18.8.3 The solid angle element $d\Omega$ for a RIO is defined as follows: $d\Omega = \sin\theta\,d\theta\,d\phi$. Write $d\Omega = -d(\cos\theta)\,d\phi$ and, assuming that the angle $\phi$ is measured in the plane which is normal to the relative velocity of two RIO $\Sigma_1$, $\Sigma_2$ (hence $d\phi_1 = d\phi_2$), use (18.47) to show that the solid angle is transformed as follows:

$$d\Omega_2 = \frac{d\Omega_1}{\gamma^2\left(1 - \beta\cos\theta_1\right)^2}. \qquad (18.59)$$


18.9 Electromagnetic Waves in a Homogeneous and Isotropic 
Medium 


The electromagnetic field propagates also in matter, however not in the same way as it does in empty space. The propagation of the electromagnetic field in a homogeneous and isotropic medium is modulated by the dielectric constant $\varepsilon$ and the magnetic permeability $\mu$ of the medium. In empty space (which may be considered as the limiting isotropic and homogeneous material) these quantities are denoted respectively $\varepsilon_0$, $\mu_0$ and are assumed to satisfy the equation:

$$c = \frac{1}{\sqrt{\varepsilon_0\mu_0}}. \qquad (18.60)$$

We have shown that for electromagnetic waves propagating in empty space the phase speed $w_0 = c$, therefore:

$$w_0 = \frac{1}{\sqrt{\varepsilon_0\mu_0}}. \qquad (18.61)$$

Within the medium it can be shown that the phase speed is given by the similar formula:

$$w = \frac{1}{\sqrt{\varepsilon\mu}} \qquad (18.62)$$


where $\varepsilon$ is the dielectric constant and $\mu$ is the magnetic permeability of the medium. The index of refraction n of the medium is defined as the ratio:

$$n = \sqrt{\frac{\varepsilon\mu}{\varepsilon_0\mu_0}} = \frac{c}{w}. \qquad (18.63)$$

It follows that in a homogeneous and isotropic medium the phase speed of light waves is given by the formula:

$$w = \frac{c}{n} \qquad (18.64)$$

that is, the phase speed of the electromagnetic wave is smaller than c by the factor $\frac{1}{n}$. As a consequence the wave vector of an electromagnetic wave in a homogeneous and isotropic material is given by:

$$\mathbf{k} = \frac{2\pi\nu}{c}\,n\,\hat{\mathbf{k}}. \qquad (18.65)$$

Consequently in a homogeneous and isotropic material the frequency four-vector:

$$f^a = \begin{pmatrix} \nu \\ n\nu\hat{\mathbf{k}} \end{pmatrix}_{\Sigma} = \nu\begin{pmatrix} 1 \\ n\hat{\mathbf{k}} \end{pmatrix}_{\Sigma} \qquad (18.66)$$


is a spacelike four-vector (in vacuum it is null because n = 1). Relation (18.64) affects formula (18.47) concerning the aberration for light. To find the new formula we consider (18.46) and use (18.64) to replace the phase speed. We find:

$$n\cos\theta_2 = \frac{n\cos\theta_1 - \beta}{1 - \beta n\cos\theta_1}. \qquad (18.67)$$

Concerning (18.45), working similarly we find that in a homogeneous and isotropic medium the Doppler effect for light waves is given by the following formula:

$$\nu_2 = \gamma\left(1 - \beta n\cos\theta_1\right)\nu_1. \qquad (18.68)$$

The physics of the above results (often referred to as radiation kinematics) is as follows. At first we note that the transverse Doppler effect is the same for the propagation of an electromagnetic wave in vacuum and in a homogeneous and isotropic medium. This emphasizes the fact that the transverse Doppler effect is due only to the relativity of time intervals between events.

For all other directions the propagation of light waves in vacuum and in a homogeneous and isotropic medium is drastically different. This is due to the fact that the term $1 - \beta n\cos\theta_1$ can be $> 0$, $< 0$ or $= 0$ depending on the velocity factor $\beta$ and the medium (index of refraction). The case $1 - \beta n\cos\theta_1 > 0$ is similar to the vacuum and in $\Sigma_2$ we have the normal Doppler effect. When $1 - \beta n\cos\theta_1 < 0$ the frequency $\nu_2 < 0$ which is absurd. However the change in the sign of the frequency can be absorbed in the phase e.g.:

$$\cos(-\omega t) = \cos\omega t, \qquad \sin(-\omega t) = \sin(\omega t + \pi).$$

Consequently the formula for the Doppler effect in a medium can be written:

$$\nu_2 = \gamma\left|1 - \beta n\cos\theta_1\right|\nu_1. \qquad (18.69)$$

It is an easy exercise to prove that when $1 - \beta n\cos\theta_1 > 0$ then $\frac{d\nu_2}{d\theta_1} > 0$, which is the same as the Doppler effect in vacuum, and when $1 - \beta n\cos\theta_1 < 0$ then $\frac{d\nu_2}{d\theta_1} < 0$, which corresponds to the 'anomalous Doppler effect'. When $1 - \beta n\cos\theta_1 = 0$ the




denominator of (18.67) vanishes and (18.68) gives $\nu_2 = 0$. This is the condition for Cérenkov radiation. A charged particle moving in a medium of refractive index n, having no internal degrees of freedom, radiates within a cone of opening angle (the Cérenkov angle):

$$\cos\theta_0 = \frac{1}{\beta n} \qquad (18.70)$$

around its propagation direction. In the case of a neutral particle the condition $1 - \beta n\cos\theta_1 = 0$ defines a cone in space called the Cérenkov cone¹¹. This cone divides space into two regions with respect to the observed Doppler effect. Outside the Cérenkov cone (defined by the condition $1 - \beta n\cos\theta_1 > 0$) one observes the normal Doppler effect for which $\frac{d\nu_2}{d\theta_1} > 0$, as it is the case for vacuum. 'Within' the Cérenkov cone (defined by the condition $1 - \beta n\cos\theta_1 < 0$) $\frac{d\nu_2}{d\theta_1} < 0$, which corresponds to the anomalous Doppler effect. In order to produce Cérenkov radiation in a medium the particles of a beam must move at least as fast as the electromagnetic waves in that medium, that is $\beta > \frac{1}{n}$. Concerning the Cérenkov angle, its maximum value is given by $\cos\theta_{0,max} = \frac{1}{n}$, i.e. $\beta \to 1$.

Fig. 18.7 shows the kinematic explanation of the Cérenkov radiation. A beam of charges emerges from an accelerator at relativistic speeds and is directed to a screen which is placed at a proper distance (a few meters depending on the energy of the beam) from the point of emission. The optical result which is observed (in $\Sigma_2$) is a straight path (a ray) in the air and a disc of light on the screen. The explanation of these images is as follows. The ray of light in the air is due to the ionization and the excitation caused to the molecules of the air by the particles of the beam. The circle of light is due to Cérenkov radiation from the traveling charges. The opening angle of the right circular cone with vertex at the point of emission and base at the screen equals twice the Cérenkov angle. What is happening is that the radiation emitted by the charges at the point of emergence intercepts the screen in a circular ring, which constitutes the outer portion of the disc of light on the screen. Charges nearer the screen emit radiation at the same angle and strike the screen in smaller concentric rings because they are nearer to the screen. By (18.70), the higher the velocity of the charges the larger the Cérenkov angle and hence the larger the radius of the light disc on the screen. In fact one method to calculate the speed of the emitted charges is to measure the radius of this disc and the distance traveled by the beam, from which one calculates the Cérenkov angle and consequently the velocity factor $\beta$.


¹¹ The Čerenkov radiation is due to the polarization of the atoms of the material caused by the
passage of the charged particle. It is an electromagnetic shock-wave phenomenon, the direct optical
analogue of the supersonic bang, or the bow wave from a swiftly moving vessel on water. It must
not be confused with the Bremsstrahlung process, which is due to the deflection of a particle (i.e. the
acceleration) caused by the strong Coulomb field of the atomic nucleus. For a detailed explanation
of the Čerenkov radiation see J. V. Jelley (1963) 'Čerenkov radiation: its Origin, Properties and
Applications', The Physics Teacher 1, 203-209.


18.9 Electromagnetic Waves in a Homogeneous and Isotropic Medium 679 


Fig. 18.7 Cérenkov cone and radiation 


Example 18.9.1 A beam of electrons of energy 700 MeV is emitted from a linear
electron accelerator and travels through air to a screen which is placed 12 m away.
Assuming that the index of refraction of air is $n = 1.00029$, calculate the radius of
the disc of light on the screen. Mass of electron: $0.5\ \mathrm{MeV}/c^2$.
Solution

The speed of the electrons is relativistic, therefore we can approximate the total
energy of the beam with the kinetic energy. This implies:

$$\gamma = \frac{E}{mc^2} = \frac{700}{0.5} = 1.4\times 10^{3}, \qquad \beta = \sqrt{1 - \frac{1}{\gamma^2}} \approx 1.$$

From (18.70) we calculate the Čerenkov angle:

$$\cos\theta_0 = \frac{1}{\beta n} = \frac{1}{1.00029} = 0.9997 \;\Rightarrow\; \theta_0 \approx 1.4^\circ.$$

The radius $R$ of the disc of light on the screen is given by:

$$R = d\tan\theta_0 = d\sqrt{\frac{1}{\cos^2\theta_0} - 1} = 12\times\sqrt{\frac{1}{0.9997^2} - 1} \approx 12\times 2.45\times 10^{-2}\ \mathrm{m} = 0.294\ \mathrm{m}.$$
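The arithmetic of this example can be checked with a short script. This is only a sketch: the function name `cherenkov_disc_radius` and its parameters are hypothetical, and it assumes the relations $E = m\gamma c^2$, $\cos\theta_0 = 1/(\beta n)$ and $R = d\tan\theta_0$ used in the solution. Without rounding $\cos\theta_0$ to four digits the radius comes out near 0.29 m, slightly below the rounded value quoted in the solution.

```python
import math

def cherenkov_disc_radius(energy_mev, rest_energy_mev, n, distance_m):
    """Return (radius in m, Cherenkov angle in degrees) for a beam hitting a screen.

    Hypothetical helper: assumes E = m*gamma*c^2 and cos(theta0) = 1/(beta*n).
    """
    gamma = energy_mev / rest_energy_mev        # Lorentz factor of the beam
    beta = math.sqrt(1.0 - 1.0 / gamma**2)      # speed factor u/c
    cos_theta = 1.0 / (beta * n)                # Cherenkov condition
    if cos_theta > 1.0:
        raise ValueError("no Cherenkov radiation: beta*n must exceed 1")
    theta = math.acos(cos_theta)
    return distance_m * math.tan(theta), math.degrees(theta)

radius, angle = cherenkov_disc_radius(700.0, 0.5, 1.00029, 12.0)
print(f"theta0 = {angle:.2f} deg, R = {radius:.3f} m")
```
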


680 18 Waves in Special Relativity 


18.10 Center of Momentum of a System of Frequency Four 
Vectors 


In this section we consider the frequency four-vectors $f_1^a, f_2^a$ of two waves and
determine the covariant quantities which they define.

We consider first the case that the four-vectors are parallel, that is $f_2^a = k f_1^a$
where $k > 0$ is an invariant.¹² Obviously this is a special case. Let $u^a$ be the
four-velocity of an observer who observes both waves. Then:

$$f_2^a u_a = k f_1^a u_a.$$

But $-\frac{1}{c} f_A^a u_a = \nu_A$, $A = 1, 2$, where $\nu_A$ is the frequency of the wave $A$
observed by $u^a$. Therefore the condition of parallelism implies:

$$\frac{\nu_2}{\nu_1} = k.$$

This relation is important because it says that the quotient of the frequencies of the
two waves observed by an observer is an invariant, although each frequency by itself
is not!

A direct application of this result is the following. We consider a point source
which emits a spectrum of waves with parallel frequency four-vectors while it is
moving with respect to an observer whose four-velocity is $u^a$. From the above result we infer
that the quotient of any two frequencies of the spectrum is constant and independent
of the observer. This means that if the Doppler shift is used by the observer $u^a$ to
measure the velocity of the source, then any pair of frequencies will give the same
result.
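The invariance of the frequency quotient can be illustrated numerically. The following is a minimal sketch, not part of the original derivation: the helper names `boost_x` and `null_frequency_vector` are hypothetical, the waves are taken to be light waves in vacuum, and the frequency four-vector is assumed to have the form $\nu(1, \hat{e})$.

```python
import math

def boost_x(f, beta):
    """Lorentz boost along x of a four-vector f = (f0, fx, fy, fz)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    f0, fx, fy, fz = f
    return (gamma * (f0 - beta * fx), gamma * (fx - beta * f0), fy, fz)

def null_frequency_vector(nu, direction):
    """Frequency four-vector nu*(1, e) of a light wave; e is normalised here."""
    norm = math.sqrt(sum(c * c for c in direction))
    return (nu,) + tuple(nu * c / norm for c in direction)

# Two waves of different frequency propagating in the same spatial direction.
e = (0.6, 0.8, 0.0)
f1 = null_frequency_vector(4.0e14, e)
f2 = null_frequency_vector(6.0e14, e)

for beta in (0.1, 0.5, 0.9):
    g1, g2 = boost_x(f1, beta), boost_x(f2, beta)
    # Each observed frequency changes with beta, their quotient does not.
    print(beta, g1[0], g2[0], g2[0] / g1[0])
```
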

In order to understand this let us consider a police car which has two sirens of
different frequency and which passes in front of an observer. The observer can
compute the velocity of the car independently for every siren. The above result
says that the two computations of the velocity of the car using the
frequencies of the sirens must give the same result, which is reasonable and expected. An
important application of this result is in Cosmology (in which Special Relativity
does not hold, but for photons we may allow ourselves an exemption!) where by
means of the Doppler shift we estimate the speeds of the celestial bodies, assumed
to be point sources of electromagnetic radiation. The emitted electromagnetic waves,
being described by null frequency four-vectors, are parallel because their spatial parts
(direction of 'sight') are parallel. The existence of the invariant $k$ indicates (for
electromagnetic waves!) the existence of a physical quantity independent of the
observer. This quantity is the spectrum of the source of the electromagnetic waves.


¹² Recall that in order for two null four-vectors to be parallel it is enough that their spatial parts be
parallel in one coordinate frame. Here we do not assume that the frequency four-vectors are null,
that is, we do not restrict our considerations to light waves propagating in vacuum.




Especially for the electromagnetic spectrum we have that it is the same irrespective
of whether it is emitted e.g. from the sun or from a source in the laboratory. For this reason it is
allowable to place one spectrum beside the other and calculate in the laboratory e.g.
the temperature of the sun (by comparing its spectrum with the radiation spectrum
of a black body) or the speed of a stellar object.

We consider next that the two frequency four-vectors $f_1^a, f_2^a$ are not parallel
and examine the various covariant quantities they define. We define the two new
four-vectors:

$$A^a = a f_1^a + b f_2^a \qquad (18.71)$$
$$B^a = c f_1^a + d f_2^a \qquad (18.72)$$

where $a, b, c, d$ are invariant quantities (in this section $c$ denotes a coefficient, not the
speed of light) and impose the constraints:

$$A^a A_a = -1, \qquad B^a B_a = 1, \qquad A^a B_a = 0 \qquad (18.73)$$

that is, we demand $A^a$ to be a unit timelike four-vector and $B^a$ to be a unit spacelike
four-vector. Physically $A^a$ can be identified with the direction of the four-velocity of an observer
and $B^a$ with a spatial direction in the 3-space of this observer.
Solving (18.71) and (18.72) for $f_1^a, f_2^a$ we find:

$$f_1^a = \frac{1}{\Delta}\left(d A^a - b B^a\right)$$

$$f_2^a = \frac{1}{\Delta}\left(-c A^a + a B^a\right)$$

where we have assumed the determinant $\Delta = ad - cb \neq 0$. Conditions (18.73) give
the equations:

$$a^2 f_1^2 + b^2 f_2^2 + 2ab\, f_1^a f_{2a} = -1 \qquad (18.74)$$
$$c^2 f_1^2 + d^2 f_2^2 + 2cd\, f_1^a f_{2a} = 1 \qquad (18.75)$$
$$ac f_1^2 + bd f_2^2 + (ad + bc)\, f_1^a f_{2a} = 0. \qquad (18.76)$$

This is a linear system of equations in terms of the unknowns $f_1^2, f_2^2, f_1^a f_{2a}$. The
determinant of this system equals $(ad - cb)^3 \neq 0$ by assumption. Therefore it
has a unique solution, which is:

$$f_1^2 = \frac{1}{\Delta^2}\left(b^2 - d^2\right) \qquad (18.77)$$

$$f_2^2 = \frac{1}{\Delta^2}\left(-c^2 + a^2\right) \qquad (18.78)$$

$$f_1^a f_{2a} = \frac{1}{\Delta^2}\left(dc - ab\right). \qquad (18.79)$$

These three conditions are not enough to determine uniquely the four coefficients
$a, b, c, d$, hence the four-vectors $A^a, B^a$, in terms of the three quantities
$f_1^2, f_2^2, f_1^a f_{2a}$. We need more conditions. One new condition, whose justification
will be given further down, is that the frequencies of the waves which are measured
by the observer $A^a$ are equal. This condition leads to the equation:

$$A^a f_{1a} = A^a f_{2a}. \qquad (18.80)$$

Using the expressions for $f_1^a, f_2^a$ in terms of $A^a, B^a$ and properties (18.73) this condition gives:

$$\frac{1}{\Delta}(-d) = \frac{1}{\Delta}(c) \;\Rightarrow\; d = -c,$$

hence:

$$B^a = c\left(f_1^a - f_2^a\right). \qquad (18.81)$$

Still $A^a$ is undetermined and we need more conditions. We demand that the
projections of the frequency four-vectors along the direction of $B^a$ be of equal
length and opposite direction. This new condition leads to the equation:

$$B^a f_{1a} = -B^a f_{2a}. \qquad (18.82)$$

Replacing as previously we find:

$$\frac{1}{\Delta}(-b) = -\frac{1}{\Delta}(a) \;\Rightarrow\; a = b,$$

hence:

$$A^a = a\left(f_1^a + f_2^a\right). \qquad (18.83)$$

However we have reached a contradiction. Indeed for these values of the parameters
the determinant $\Delta = 0$ contrary to our assumption. Furthermore we note that:

$$f_1^2 - f_2^2 = \frac{1}{\Delta^2}\left(b^2 - d^2 - a^2 + c^2\right) = 0 \;\Rightarrow\; f_1^2 = f_2^2.$$

This motivates the search for different considerations.


We assume the waves to be electromagnetic waves so that the frequency four-vectors
are null, i.e.:

$$f_1^a f_{1a} = f_2^a f_{2a} = 0.$$

Then $\Delta = 0$ and it is not possible to express the frequency four-vectors uniquely in
terms of the four-vectors $A^a, B^a$. However this does not bother us because we only
want one solution for the four-vectors $A^a, B^a$ in terms of $f_1^a, f_2^a$ out of the infinitely
many. We work as follows.

From equations (18.74) and (18.75), with $a = b$ and $d = -c$, we compute $a^2 = c^2$; therefore we
have for the four-vectors $A^a, B^a$:

$$A^a = c\left(f_1^a + f_2^a\right), \qquad B^a = c\left(f_1^a - f_2^a\right).$$

From the condition:

$$2c^2 f_1^a f_{2a} = -1$$

we obtain finally:

$$A^a = \pm\frac{1}{\sqrt{-2 f_1^b f_{2b}}}\left(f_1^a + f_2^a\right), \qquad B^a = \frac{1}{\sqrt{-2 f_1^b f_{2b}}}\left(f_1^a - f_2^a\right). \qquad (18.84)$$

In order for this answer to be acceptable it must hold that $-2 f_1^a f_{2a} > 0$. Because
$-2 f_1^a f_{2a}$ is an invariant it is enough to compute its value for one RIO
only. We choose the proper frame of $A^a$ and write

$$A^a = \begin{pmatrix}1\\ \mathbf{0}\end{pmatrix}, \quad B^a = \begin{pmatrix}0\\ \hat{\mathbf e}\end{pmatrix}, \quad f_1^a = \nu\begin{pmatrix}1\\ \hat{\mathbf e}\end{pmatrix}, \quad f_2^a = \nu\begin{pmatrix}1\\ -\hat{\mathbf e}\end{pmatrix}$$

where $\nu$ is the common frequency of the photons as
measured by $A^a$ and $\hat{\mathbf e}$ is the common direction of propagation of the photons in the
same frame. The inner product:

$$f_1^a f_{2a} = \nu\begin{pmatrix}1\\ \hat{\mathbf e}\end{pmatrix}\cdot\nu\begin{pmatrix}1\\ -\hat{\mathbf e}\end{pmatrix} = -2\nu^2 < 0$$

therefore the solution we have found is acceptable. We have still to fix the sign of $A^a$
in (18.84). To do that we assume that $A^a$ is future directed, hence its zero component
must be positive. Then we have the unique solution:

$$A^a = \frac{1}{\sqrt{-2 f_1^b f_{2b}}}\left(f_1^a + f_2^a\right), \qquad B^a = \frac{1}{\sqrt{-2 f_1^b f_{2b}}}\left(f_1^a - f_2^a\right).$$

The above lead us to the following Theorem.


Theorem 18.10.1 For every pair of photons with frequency four-vectors $f_1^a, f_2^a$ such
that $f_1^a f_{2a} < 0$ there is a unique observer with four-velocity along
$A^a = \frac{1}{\sqrt{-2 f_1^b f_{2b}}}\left(f_1^a + f_2^a\right)$ for whom the photons have the common
frequency $\nu = \sqrt{-\frac{1}{2} f_1^a f_{2a}}$ and are
moving in the proper frame of $A^a$ in opposite directions, which are determined from
the unit spacelike vector $B^a = \frac{1}{\sqrt{-2 f_1^b f_{2b}}}\left(f_1^a - f_2^a\right)$. The proper frame of $A^a$ we
call the center of momentum of the two photons. Such a system does not exist for
non-electromagnetic waves or for particles with mass.


Theorem (18.10.1) emphasizes the fundamental difference between the electro- 
magnetic waves and the rest of the waves. 
We collect the above results as follows: 


• If two photons of different frequency are moving in parallel space directions in
  the 3-space of a RIO Σ, then they do so for all other RIOs Σ'. If the
  frequencies of the photons for Σ and Σ' are $\nu_1, \nu_2$ and $\nu_1', \nu_2'$ respectively then

  $$\frac{\nu_1}{\nu_2} = \frac{\nu_1'}{\nu_2'}.$$

• If two photons of different frequency are moving in different spatial directions in
  the 3-space of an observer Σ then there exists another observer Σ', uniquely
  defined, for whom the photons have the same frequency and they move in
  antiparallel directions in the 3-space of the proper frame of Σ'.
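The construction of the center of momentum of two photons can be sketched numerically as follows. The function names `minkowski` and `photon_center_of_momentum` are hypothetical, the metric signature is taken as $(-,+,+,+)$, and frequencies are in units where the measured frequency of a wave $f^a$ for a unit timelike vector $A^a$ is $-A^a f_a$; these conventions are assumptions made for the illustration only.

```python
import math

def minkowski(a, b):
    """Inner product of two four-vectors with signature (-, +, +, +)."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))

def photon_center_of_momentum(f1, f2):
    """Return (A, B, nu) for two non-parallel null frequency four-vectors.

    A is the unit timelike vector, B the unit spacelike vector and nu the
    common frequency, following the normalisation 1/sqrt(-2 f1.f2).
    """
    s = -2.0 * minkowski(f1, f2)
    if s <= 0.0:
        raise ValueError("the four-vectors must satisfy f1.f2 < 0")
    norm = math.sqrt(s)
    A = tuple((x + y) / norm for x, y in zip(f1, f2))
    B = tuple((x - y) / norm for x, y in zip(f1, f2))
    nu = math.sqrt(-0.5 * minkowski(f1, f2))
    return A, B, nu

# Two photons with different frequencies moving along x and y respectively.
f1 = (3.0, 3.0, 0.0, 0.0)
f2 = (1.0, 0.0, 1.0, 0.0)
A, B, nu = photon_center_of_momentum(f1, f2)
print("A.A =", minkowski(A, A), " B.B =", minkowski(B, B), " A.B =", minkowski(A, B))
print("frequencies seen by A:", -minkowski(A, f1), -minkowski(A, f2), " nu =", nu)
```

In the frame of $A^a$ both photons have the same frequency and their projections on $B^a$ are opposite, which is what the second bullet states.
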


18.11 Waves and Particles 


An important implication of the frequency four-vector, perhaps as important as
the relation $E = m\gamma c^2$ (equivalence of mass and energy), is the discovery of
the deBroglie waves, which led to the development of Quantum Physics. In this
section we introduce the deBroglie waves and prepare the ground for the proper
understanding of the Physics behind the Schrödinger equation.

As a rule, the discoveries of Physics are made in small steps, which indicate a
new direction of understanding of nature, until someone makes a major step which
changes the situation dramatically. In this respect behind any new theory of Physics
there is a 'history' which, as is the case with history in general, does not become clear and fair
until the fog of prejudice has gone and the events are properly understood and
have largely passed into the realm of the past. In the case of the deBroglie waves the
history is as follows. In 1900 Planck, in order to explain the spectrum of black
body radiation, suggested that electromagnetic radiation is emitted in 'packets'
or quanta. This assumption did explain the totality of the spectrum (the existing
theory could explain only the upper part concerning the higher wavelengths) but
was considered to be a mathematical 'trick' which 'was working'. The main reason
was that at the time no 'logical' person could comprehend how radiation could be
understood as discontinuous. Then, about the same time, a new phenomenon came
along which also could not be explained by a continuous electromagnetic radiation.
It was observed that when electromagnetic radiation was falling on a metal, (free)
electrons were removed from the surface of the metal as follows:

a. There was a cutoff energy below which electrons were not observed.
b. The number of electrons (i.e. the anodic current) was proportional to the intensity
   of the radiation but not to the frequency.


Later, in 1905, Einstein, using the assumption of Planck, suggested that light
is not only emitted in quanta but also propagates and is absorbed in quanta. With
this assumption he was able to explain the observational data by considering
particle collisions between the light quanta and the free electrons of the metal.
This phenomenon was called the photoelectric phenomenon. It then became clear
that the quantum of light was indeed a reality and not a mathematical token of
theoretical Physics. Then an "electromagnetic particle" was introduced which was later
called the 'photon', a term used uninterruptedly ever since. Much later, in 1923, a
young student, Louis deBroglie, in his Ph.D.¹³ thesis considered the inverse approach
to that of Einstein, that is: since light has two natures, i.e. wave and particle, why can
particles not also have two natures, i.e. be particles and the quanta of a wave? As we shall
discuss below, this led him to a contradiction which could be resolved provided one
attached to each particle a wave with phase velocity $w > c$ in empty space.
These waves, which are not electromagnetic but of a different nature (not specified
by Louis deBroglie), are similar to the photons and differ in that their phase velocity
$w$ differs from their group velocity (which equals the velocity $u$ of the particle)
in such a way that the product of the speeds is $wu = c^2$. For light waves
$w = u = c$, therefore the formula applies to all types of particles, i.e. photons
included. The introduction of the deBroglie waves was possible only in Special
Relativity and specifically by the use of the frequency four-vector. In order to prove
the validity of the Louis deBroglie approach one had to show that particles satisfy
the two major wave phenomena, that is the Doppler effect and the phenomenon
of interference. These phenomena were indeed observed, verifying Louis deBroglie's
considerations. Soon afterwards, Schrödinger put Louis deBroglie's approach in
proper mathematical language and introduced the Schrödinger equation, which was
the foundation of Quantum Mechanics.

It is to be noted that the photoelectric phenomenon gave Einstein a Nobel prize
and the deBroglie waves gave Louis deBroglie the same prize (the first time for a
Ph.D. thesis). Both men were young when they put forward their views, a fact which must
be considered thoughtfully.


18.11.1 deBroglie Waves and Light 


In order to understand the deBroglie waves we start from the familiar case of the
quantum of the electromagnetic field (i.e. the photon). Consider a RIO Σ in which a


¹³ Louis deBroglie (1923) Comptes rendus 177, pp. 507-510 and (1924) PhD Thesis "Recherches
sur la théorie des Quanta", University of Paris.




photon as a particle has four-momentum $p^i = \frac{E}{c}(1, \hat{\mathbf e})_\Sigma$ and as a wave has frequency
four-vector $f^i = \nu(1, \hat{\mathbf k})_\Sigma$. These two four-vectors are independent and describe
two different kinds of physical entities. Indeed the second satisfies the Doppler
phenomenon, which makes no sense for the first, while the first satisfies the law of
conservation of four-momentum, which makes no sense for the second. On the other
hand the two four-vectors $p^i, f^i$ have common characteristics. For example both
are null vectors and are defined in Σ from a 3-direction ($\hat{\mathbf e}$ and $\hat{\mathbf k}$) and one component
(the energy $E$ and the frequency $\nu$) respectively. If we consider that the 3-direction
is common, that is $\hat{\mathbf e} \parallel \hat{\mathbf k}$, then a relation between their zeroth components leads
to a relation between the four-vectors.¹⁴ Working along this line Einstein defined
$\hat{\mathbf e} = \hat{\mathbf k}$ and used Planck's hypothesis

$$E = h\nu$$

to relate the two four-vectors as follows:

Four-momentum of photon $= \frac{h}{c}\,\times$ frequency four-vector

$$p^i = \frac{h}{c} f^i. \qquad (18.85)$$

In this formula the coupling constant relating the two four-vectors contains
Planck's constant $h$ and the speed of light $c$, both universal constants, hence
(relativistic) invariants. The lhs contains the information concerning the photon as a particle and
the rhs the information concerning the photon as a wave.¹⁵

In Newtonian Physics the relations between the elements specifying the wave and
the particle character of a photon are less obvious and necessarily break into two
independent relations as follows:

$$p = \frac{h\nu}{c}, \qquad E = h\nu. \qquad (18.86)$$

However the quantities $\nu, c$ are not Euclidean invariants and one runs into problems,
but we shall not consider this subject further.
Relation (18.85) is covariant because it contains only four-vectors and (relativistic)
universal constants, therefore although defined for a specific RIO Σ, it holds for
all RIOs.


¹⁴ Recall that a null vector in a RIO Σ is defined in terms of one scalar $A^0$ and a unit 3-direction $\hat{\mathbf e}$
by the formula $A^0(1, \hat{\mathbf e})_\Sigma$.


¹⁵ Universal constants always relate physical quantities of different nature.


18.11.2. deBroglie Waves and Particles 


A question which arises quite naturally, and was posed by Louis deBroglie in his
Ph.D. thesis, is:


The Doppler phenomenon is general and applies to all phase speeds in the interval $[0, +\infty)$.
Furthermore the conservation of four-momentum holds for all particles with speeds in the
interval $(0, c]$. Is it possible to extend (18.85) to particles with non-vanishing mass, and if
'yes' under what conditions?


This question led him immediately to a contradiction¹⁶ which we discuss below.

Suppose that we wish to associate a wave with the particle. As we have seen, the
quantity defining a wave is the frequency four-vector. Because the wave which will
be associated with the particle will be (eventually) related to the four-momentum,
we demand that:


• The frequency four-vector shall be a timelike four-vector and
• The frequency four-vector shall share the same proper frame with the rest of the
  particle's timelike four-vectors, i.e. $x^i$, $p^i$ etc.


The second requirement demands that we define a 'proper frequency' $\nu_0$
for the particle and a proper wave vector $\mathbf k_0$. It is reasonable to accept that in Σ⁺
these quantities shall be defined by means of the relations:

$$h\nu_0 = mc^2, \qquad \mathbf k_0 = 0$$

where $m$ is the (proper) mass of the particle. Then the particle frequency four-vector
in the proper frame Σ⁺ of the particle is defined to be

$$f^i = \begin{pmatrix}\nu_0\\ \mathbf 0\end{pmatrix}_{\Sigma^+}. \qquad (18.87)$$

By postulating that this is a four-vector we have postulated that this is a potential
physical quantity in Special Relativity. Experiment will prove or disprove whether this
claim is valid!

We cannot use the same procedure in order to find the frequency four-vector in
another RIO Σ. To do that we must use the appropriate Lorentz transformation. Let
us assume that the frequency four-vector in Σ is

$$f^i = \begin{pmatrix}\nu\\ \frac{\nu c}{w}\hat{\mathbf e}\end{pmatrix}_\Sigma \qquad (18.88)$$


¹⁶ See J. Haslett 'Phase waves of Louis deBroglie' (1972) Amer. J. Phys. 40, 1315-1320. This paper
is a translation of the first chapter of Louis deBroglie's thesis.




and that the particle's velocity in Σ is $\mathbf u$. Then the Lorentz transformation gives:

$$\nu = \nu_0\gamma \qquad (18.89)$$
$$\frac{\nu c}{w}\hat{\mathbf e} = \gamma\boldsymbol\beta\nu_0. \qquad (18.90)$$

In order to compute the phase speed of the deBroglie wave in Σ we consider
the invariance of the length of the frequency four-vector and, taking into consideration
(18.89), we find

$$w = \frac{c}{\beta} = \frac{c^2}{u}. \qquad (18.91)$$

Relation (18.91) determines the phase speed of the deBroglie wave in Σ in terms of
the speed of the particle in Σ and the universal constant $c$. Obviously $w > c$.
Finally the frequency four-vector of the particle in Σ is as follows:

$$f^i = \nu\begin{pmatrix}1\\ \boldsymbol\beta\end{pmatrix}_\Sigma = \frac{mc^2}{h}\gamma\begin{pmatrix}1\\ \boldsymbol\beta\end{pmatrix}_\Sigma. \qquad (18.92)$$

We note that the frequency four-vector of the particle is defined only in terms of the
particle's data in Σ.
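The transformation from the proper frame to Σ can be checked numerically. The sketch below is illustrative only: the function name `de_broglie_frequency_vector` is hypothetical, SI values of $c$ and $h$ are used, and the frequency four-vector is assumed to have components $(\nu, \nu c/w)$ along the direction of motion.

```python
import math

C = 299_792_458.0        # speed of light in m/s
H = 6.62607015e-34       # Planck constant in J s

def de_broglie_frequency_vector(mass_kg, speed):
    """Boost the proper-frame vector (nu0, 0) to the frame where the particle has `speed`.

    Returns (nu, k, w): time component, spatial component and phase speed.
    """
    beta = speed / C
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    nu0 = mass_kg * C**2 / H          # proper frequency, h*nu0 = m*c^2
    nu = gamma * nu0                  # time component after the boost
    k = gamma * beta * nu0            # spatial component after the boost
    w = C * nu / k                    # phase speed read off from k = nu*c/w
    return nu, k, w

m_e = 9.1093837015e-31               # electron mass in kg
u = 0.6 * C
nu, k, w = de_broglie_frequency_vector(m_e, u)
print("w/c =", w / C, " w*u/c^2 =", w * u / C**2)
```

The product of phase speed and particle speed comes out equal to $c^2$, and the phase speed exceeds $c$ for any $u < c$.
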

To calculate the group velocity of the deBroglie wave in Σ we consider that
at $\tau = 0$ the wave has a profile $x_0$ in the proper frame of the particle, i.e. it is a
point (contrary to the Heisenberg Principle of Quantum Mechanics!). Considering
the boost relating Σ⁺, Σ we compute:


c 
x=yxo, ct=ypx>x= A 


The group velocity in & is: 


dx Cc 
v= — =—-—=u (18.93) 
dt|r=0 +B 


that is, the particle velocity. From (18.91) it follows that the product of the phase
speed and the group speed of the deBroglie wave is constant and equal to $c^2$. We
note that this relation is compatible with the result that the phase velocity of light
waves equals the group velocity of photons $c$, hence their product equals $c^2$.

Let us see now the controversy discovered by Louis deBroglie. The frequency is
$\nu = \frac{1}{T}$ where $T$ is the period of the wave (or of a periodic phenomenon in general).
Then (18.89) gives $T = \frac{T_0}{\gamma}$, where $T_0 = \frac{1}{\nu_0}$ is the proper period, whereas by Lorentz
time dilation $T = \gamma T_0$! How can these
be reconciled? The answer is simple. The frequency of a wave is determined by
the phase velocity whereas the Lorentz time dilation is due to the particle velocity,
i.e. the group velocity. These two quantities are different for deBroglie waves, hence
they cannot be used at the same time. In other words the period of a periodic
phenomenon is not necessarily the time difference of two events along the same
worldline. The same situation holds for light waves, for which the phase speed is $c$,
hence $\gamma = \infty$ and Lorentz time dilation does not apply. In conclusion:

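The deBroglie waves associated with a particle have a phase velocity $w = \frac{c^2}{u}$,
where $u$ is the velocity of the particle, and a group velocity $v = u$. The frequency of
the deBroglie wave in the proper frame of the particle equals $\frac{mc^2}{h}$ and the frequency
four-vector is a timelike four-vector which in the RIO Σ has components:

$$f^i = \begin{pmatrix}\nu\\ \frac{\nu c}{w}\hat{\mathbf e}\end{pmatrix}_\Sigma \qquad (18.94)$$

where $\hat{\mathbf e} \parallel \mathbf u$.
The space part:

$$\frac{\nu c}{w}\hat{\mathbf e} = \frac{\nu c}{wu}\mathbf u = \frac{\nu c}{c^2}\mathbf u = \frac{\nu}{c}\mathbf u$$

therefore:

$$f^i = \begin{pmatrix}\nu\\ \frac{\nu}{c}\mathbf u\end{pmatrix}_\Sigma = \frac{mc}{h}\begin{pmatrix}\gamma c\\ \gamma\mathbf u\end{pmatrix}_\Sigma = \frac{mc}{h}u^i = \frac{c}{h}p^i$$

or:

$$p^i = \frac{h}{c} f^i. \qquad (18.95)$$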


We note that the frequency four-vector of the deBroglie wave is proportional
to the four-momentum of the particle, the constant of proportionality being the
universal constant $\frac{h}{c}$, a relation which is compatible with relation (18.85) for
photons, hence applies to all particles!

We write this relation in terms of components in Σ. We have:

$$\begin{pmatrix}E/c\\ m\gamma\mathbf u\end{pmatrix}_\Sigma = \frac{h}{c}\begin{pmatrix}\nu\\ \frac{\nu c}{w}\hat{\mathbf e}\end{pmatrix}_\Sigma$$

from which it follows that for the deBroglie wave:

$$E = h\nu \qquad (18.96)$$
$$wu = c^2. \qquad (18.97)$$

We summarize the above results as follows:
We summarize the above results as follows: 


The deBroglie wave which is associated with a particle of (proper) mass $m$ and
velocity $v$ in a RIO Σ has:

1. Frequency $\nu$ given by the relation $E = h\nu$.
2. Phase velocity $w = \frac{c^2}{v}$. Waves with phase speed $< c$ cannot be deBroglie
   waves.
3. The frequency four-vector is timelike with length:

$$f^i f_i = -\nu_0^2 = -\frac{m^2c^4}{h^2}.$$

More specifically for the deBroglie waves we have the following general 
characteristics. 


a. The deBroglie waves are determined only in terms of their frequency and phase
   velocity but not their amplitude. This freedom can be used to describe
   particles with the same mass but different physical characteristics. For example
   the electron and the positron with spin $\frac{1}{2}$ and $-\frac{1}{2}$ are described with different
   deBroglie waves.

b. Particles with different mass are described with different deBroglie waves.
   Indeed, let $p_1^i, p_2^i$ be the four-momenta of the particles and let $f^i$ be the
   common four-frequency. Then:

   $$p_1^i = \frac{h}{c} f^i = p_2^i \;\Rightarrow\; m_1 = m_2.$$

c. The frequency of a deBroglie wave depends on the RIO in which it is considered. That is,
   if a particle has energy $E$ in the RIO Σ then the frequency of the deBroglie wave
   in Σ (!) is $\nu = E/h$. In another RIO Σ' in which the energy of the particle is $E'$
   the frequency of the deBroglie wave is $\nu' = E'/h \neq \nu$.

d. The wavelength of a deBroglie wave of frequency $\nu$ and phase speed $w$ in the
   RIO Σ is given by the relation

   $$\lambda = \frac{w}{\nu} = \frac{c^2/v}{m\gamma c^2/h} = \frac{h}{m\gamma v} = \frac{h}{|\mathbf p|} = \frac{hc}{\sqrt{E^2 - m^2c^4}}.$$

   In the proper frame of the particle $v = 0$, hence $\lambda = \infty$, and $\lambda$ diminishes as
   the energy of the particle increases. For a photon $\lambda_{\text{photon}} = \frac{c}{\nu} = \frac{hc}{E}$. This
   means that if we use particles (i.e. deBroglie waves) instead of photons, it is
   possible to increase the resolving power of an instrument. This result has led to
   the development of the electron microscope.

e. The dualism of the nature of all particles (including photons) also has effects
   in dynamics. The equations of motion of Mechanics which concern particles
   are produced by the Principle of Least Action (Hamilton equations) whereas
   the equations of propagation of a wave follow from Fermat's Principle. Therefore
   the two Principles must be related and we assume that the equations of motion
   of particles with mass are included in the 'equations of motion' of the waves
   when the frequency of the associated deBroglie wave is small. The Physics
   of waves leads to Quantum Physics (Schrödinger's equation is the Hamilton-Jacobi
   equation) which could not have been developed if the deBroglie waves had not
   previously been introduced.
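The two equivalent forms of the wavelength in item d can be compared numerically. This is a minimal sketch under stated assumptions: the function names are hypothetical, SI constants are used, and the wavelength is taken as $\lambda = h/(m\gamma v) = hc/\sqrt{E^2 - m^2c^4}$.

```python
import math

C = 299_792_458.0        # speed of light in m/s
H = 6.62607015e-34       # Planck constant in J s

def wavelength_from_speed(mass_kg, speed):
    """de Broglie wavelength h / (m * gamma * v)."""
    gamma = 1.0 / math.sqrt(1.0 - (speed / C) ** 2)
    return H / (mass_kg * gamma * speed)

def wavelength_from_energy(mass_kg, energy_j):
    """de Broglie wavelength h*c / sqrt(E^2 - m^2 c^4) from the total energy E."""
    rest = mass_kg * C**2
    return H * C / math.sqrt(energy_j**2 - rest**2)

m_e = 9.1093837015e-31                       # electron mass in kg
v = 0.6 * C
gamma = 1.0 / math.sqrt(1.0 - 0.6**2)
E = gamma * m_e * C**2                       # total energy of the electron
print(wavelength_from_speed(m_e, v), wavelength_from_energy(m_e, E))
```

Both expressions give the same value, and the wavelength shrinks as the speed (hence the energy) grows.
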




Example 18.11.1 (Bohr's quantum condition) Consider a particle of mass $m$ which
moves around a circular orbit with constant speed $u$. Then the (total) energy of the
particle is a constant of motion, therefore the frequency $\nu$ of the deBroglie wave
associated with the particle is constant. The phase velocity of the deBroglie wave is:

$$\mathbf w = \frac{c^2}{u^2}\mathbf u = \frac{c^2}{u}\hat{\mathbf e}_\phi,$$

where $\phi$ is the polar angle. The phase speed $w$ is independent of time, therefore the
deBroglie wave is a standing wave. The wavelength $\lambda$ of this wave is:

$$\lambda = \frac{w}{\nu} = \frac{hc}{\sqrt{E^2 - m^2c^4}} = \text{constant}.$$

Because the wave is a standing wave there must be an integer number of wavelengths
along the trajectory $2\pi R$, where $R$ is the radius of the circular orbit. Therefore we
demand the quantum condition:

$$2\pi R = n\lambda = \frac{nhc}{\sqrt{E^2 - m^2c^4}}.$$

This condition means that possible energy levels are the ones for which the
wavelength of the deBroglie wave is an integer submultiple of the length of the
circular orbit. This condition is known as the quantum condition of Bohr and was
used in early Quantum Mechanics to explain the energy spectrum of the
hydrogen atom H. Later Schrödinger improved this model by considering an
electron moving under the action of the Coulomb field of the nucleus. In this case
the orbit is an ellipse which precesses and leads to the fine structure of the energy
levels of the hydrogen atom.


Although the association of a particle with a deBroglie wave has been done
within a rigorous and concise mathematical model, it is not necessary that nature
agrees with all that. That is, we have still to see if nature does treat a particle as a
wave. To show this one has to verify that particles do possess the wave properties.
One such major property is the interference of waves.

An electron with usual speed corresponds to a deBroglie wave with frequency
in the region of X rays. Therefore if we direct a beam of such electrons at a
hole (whose diameter is near the wavelength of the deBroglie wave for better
results) we expect to find (not see, because the waves which interfere are deBroglie
waves!) a distribution of the electron density corresponding to the compressions and
rarefactions of the corresponding deBroglie waves.

Three years after Louis deBroglie asserted that particles of matter could possess
wavelike properties, the diffraction of electrons from the surface of a solid crystal
was experimentally observed by C. J. Davisson and L. H. Germer of the Bell
Telephone Laboratory. In 1927 they reported¹⁷ their investigation of the angular
distribution of electrons scattered from a nickel crystal. With careful analysis, they
showed that the electron beam was scattered by the surface atoms of the nickel at
the exact angles predicted for the diffraction of x-rays according to Bragg's formula,
with a wavelength given by the deBroglie equation, $\lambda = h/mv$. Just as Compton
showed that waves could act like particles, the results of Davisson and Germer
showed that particles act as waves.

Later in 1927, G. P. Thomson reported his experiments, in which a beam of
energetic electrons was diffracted by a thin foil. Thomson found patterns that
resembled the x-ray patterns made with powdered (polycrystalline) samples. This
kind of diffraction, by many randomly oriented crystalline grains, produces rings.
If the wavelength of the electrons is changed by changing their incident energy, the
diameters of the diffraction rings change proportionally, as expected from Bragg's
equation.

The experiments of Davisson and Germer and those of Thomson proved that the
deBroglie waves are not simply mathematical conveniences, but have observable
physical effects. The 1937 Nobel Prize in Physics was awarded to Davisson and
Thomson for their pioneering work.


Example 18.11.2 The electron gun consists of a heated filament that releases
thermally excited electrons which are accelerated through a potential difference
$V$ (giving them a kinetic energy $eV$ where $e$ is the charge of the electron). The
experiment of Davisson and Germer consists of firing an electron beam from an
electron gun at a nickel crystal at normal incidence (i.e. perpendicular to the surface
of the crystal). The spacing of the crystalline planes of nickel is known from X-ray
scattering experiments on crystalline nickel and it is equal to $d = 0.091$ nm.

Assuming that the kinetic energy of the electron beam is 54 eV, examine if these
data are consistent with Bragg's Law of diffraction if the electron detector has to be
placed at an angle $\theta = 50^\circ$.
Solution

According to the deBroglie relation, a beam of 54 eV has a wavelength of
0.165 nm. Bragg's law states that for normal incidence:

$$n\lambda = 2d\sin\left(90^\circ - \frac{\theta}{2}\right)$$

where $n$ is an integer, $d$ is the spacing of the crystal planes and $\theta$ is the angle of
diffraction. Replacing the given data we find:

$$n\times 0.165 = 2\times 0.091\times\sin(90^\circ - 25^\circ) \;\Rightarrow\; n \approx 1.00.$$


¹⁷ See C. Davisson, L. H. Germer (1927) "Reflection of electrons by a crystal of nickel", Nature
Vol. 119: 558-560 and C. Davisson (1928) "Are Electrons Waves?", Franklin Institute Journal
205, 597.




It follows that the deBroglie waves are consistent with Bragg's law for $n = 1$.
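The numbers of this example can be reproduced with a few lines of code. The sketch below rests on assumptions that are not in the text: the function names are hypothetical, SI constants are used, and the electron momentum is computed non-relativistically as $p = \sqrt{2m_eK}$, which is adequate because 54 eV is tiny compared with the electron rest energy. With these constants the computed wavelength is close to 0.167 nm, in line with the 0.165 nm quoted in the solution.

```python
import math

H = 6.62607015e-34           # Planck constant in J s
M_E = 9.1093837015e-31       # electron mass in kg
E_CHARGE = 1.602176634e-19   # elementary charge in C

def electron_wavelength_nm(kinetic_ev):
    """de Broglie wavelength in nm, using the non-relativistic p = sqrt(2 m K)."""
    p = math.sqrt(2.0 * M_E * kinetic_ev * E_CHARGE)
    return H / p * 1e9

def bragg_order(wavelength_nm, d_nm, theta_deg):
    """Order n from n*lambda = 2 d sin(90 deg - theta/2) at normal incidence."""
    return 2.0 * d_nm * math.sin(math.radians(90.0 - theta_deg / 2.0)) / wavelength_nm

lam = electron_wavelength_nm(54.0)
print(f"lambda = {lam:.4f} nm")
print(f"n with lambda = 0.165 nm: {bragg_order(0.165, 0.091, 50.0):.3f}")
```
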


The second important wave phenomenon is the Doppler shift. The way to 
demonstrate the Doppler shift of deBroglie waves is to prove the reconciliation of 
the Doppler effect with the conservation of four-momentum. This is shown in the 
following example. 


Example 18.11.3 Three particles 1, 2, 3 of masses $m_1, m_2, m_3$ react as follows:
$1 + 2 \to 3$. In a RIO Σ in which the energies of the particles are $E_1, E_2, E_3$ respectively
calculate the frequency four-vector of the deBroglie wave of particle 3 and show that
the Doppler relation is satisfied if we consider as observers the particles 1, 2.
Solution 

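The frequency four-vector of the deBroglie wave of particle 3 in Σ is:

$$f_3^i = \frac{c}{h}p_3^i = \frac{c}{h}\begin{pmatrix}E_3/c\\ \mathbf p_3\end{pmatrix}_\Sigma = \frac{E_3}{hc}\begin{pmatrix}c\\ \mathbf v_3\end{pmatrix}_\Sigma. \qquad (18.98)$$

Conservation of four-momentum gives:

$$E_3 = E_1 + E_2 \;\Rightarrow\; f_3^i = \frac{E_1 + E_2}{hc}\begin{pmatrix}c\\ \mathbf v_3\end{pmatrix}_\Sigma. \qquad (18.99)$$

The velocity is found from the conservation of the 3-momenta:

$$\frac{E_1 + E_2}{c^2}\mathbf v_3 = \frac{E_1}{c^2}\mathbf v_1 + \frac{E_2}{c^2}\mathbf v_2 \;\Rightarrow\; \mathbf v_3 = \frac{E_1\mathbf v_1 + E_2\mathbf v_2}{E_1 + E_2}. \qquad (18.100)$$

Let $u_1^i = \gamma_1\begin{pmatrix}c\\ \mathbf v_1\end{pmatrix}_\Sigma$, $u_2^i = \gamma_2\begin{pmatrix}c\\ \mathbf v_2\end{pmatrix}_\Sigma$ be the four-velocities of particles 1, 2 in Σ.
Then the frequencies of the deBroglie wave of particle 3 with respect to the particles 1, 2 are:

$$\nu_{3,1} = -\frac{1}{c} f_3^i u_{1i} = -\frac{E_1 + E_2}{hc^2}\gamma_1\left(\mathbf v_1\cdot\mathbf v_3 - c^2\right)$$
$$\nu_{3,2} = -\frac{1}{c} f_3^i u_{2i} = -\frac{E_1 + E_2}{hc^2}\gamma_2\left(\mathbf v_2\cdot\mathbf v_3 - c^2\right).$$

Dividing we find:

$$\frac{\nu_{3,1}}{\nu_{3,2}} = \frac{\gamma_1\left(\mathbf v_1\cdot\mathbf v_3 - c^2\right)}{\gamma_2\left(\mathbf v_2\cdot\mathbf v_3 - c^2\right)}. \qquad (18.101)$$

The phase velocity $\mathbf w_3$ of the deBroglie wave of particle 3 is $\mathbf w_3 = \frac{c^2}{v_3^2}\mathbf v_3$. Therefore:

$$\frac{\mathbf v_1\cdot\mathbf v_3}{c^2} = \frac{\mathbf v_1\cdot\mathbf w_3}{w_3^2}, \qquad \frac{\mathbf v_2\cdot\mathbf v_3}{c^2} = \frac{\mathbf v_2\cdot\mathbf w_3}{w_3^2}.$$

Replacing in (18.101) we find the Doppler formula.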


We note that the conservation of four-momentum for the reaction: 
1424-3 142'+--. 
in terms of deBroglie waves is written: 
Re ers Ge cane 


According to this form of the conservation law, the reaction of particles is 
understood as a superposition of deBroglie waves (one such wave for each particle) 
and the production of other deBroglie waves. For example in the reaction we 
considered, the deBroglie waves of the particles 1, 2 in the frame Σ “interact”
(= superimpose) and produce a new deBroglie wave which is the one defined by
particle 3 in Σ.


Example 18.11.4 Show that a necessary and sufficient condition for the quotient of
the components of two four-quantities to define a Lorentz invariant is that the
four-quantities are four-vectors.
Application

Assume that the dualistic view of the wave and the particle character of matter
is valid and that the energy E of the particle view corresponds to the frequency ν
of the wave nature through the relation E = hν, where h is a universal constant. Then
deduce (18.91), which relates the frequency four-vector and the four-momentum of
a relativistic particle.¹⁸
Solution 

Assume that in a RIO Σ the four-quantities (not necessarily four-vectors!)
$(a_0, a_1, a_2, a_3)$ and $(b_0, b_1, b_2, b_3)$ are such that $a_i = A b_i$, $i = 0, 1, 2, 3$. Let
Σ′ be another RIO which is related to Σ with the Lorentz transformation $L^{j}_{i'}$.
Under the action of the Lorentz transformation the four-quantities $(a_0, a_1, a_2, a_3)$,
$(b_0, b_1, b_2, b_3)$ become $(a_{0'}, a_{1'}, a_{2'}, a_{3'})$, $(b_{0'}, b_{1'}, b_{2'}, b_{3'})$ and are related as follows:

$$a_{i'} = L^{j}_{i'}\,a_j, \qquad b_{i'} = L^{j}_{i'}\,b_j.$$


¹⁸See R. Newburgh (1956), Lett. Nuovo Cimento 29, 195-196, P. Dirac (1924) Proc. Cambridge
Philos. Soc. 22, 432.




Assume that the quotient $a_i/b_i$, $i = 0, 1, 2, 3$, is invariant, hence independent of the
particular RIO in which it is computed. Therefore in Σ′ we must have $a_{i'} = A b_{i'}$, $i = 0, 1, 2, 3$,
where A is a Lorentz invariant. This implies:

$$(a_{0'}, a_{1'}, a_{2'}, a_{3'}) = A\,(b_{0'}, b_{1'}, b_{2'}, b_{3'}) \;\Rightarrow\; a_{i'} = A b_{i'} = A L^{j}_{i'} b_j = L^{j}_{i'} a_j$$

that is, the four-quantity $a_i$ is a four-vector. Since A is Lorentz invariant the same
holds for the four-quantity $b_i$. We conclude that the four-quantities $a_i$ and $b_i$ are
four-vectors which are also parallel.

The converse is obvious.
Application. 

The particle character of a particle in a RIO Σ is described by the four-momentum
$p^i = (E/c, \mathbf{p})_\Sigma$. Concerning the wave character of the particle we know
that a (relativistic) wave is characterized by:

- The amplitude $A(x^i)$
- The frequency ν
- The phase velocity w or the wave vector k.

These quantities are related as follows:

$$F(x^i) = A(x^i)\,P(x^i f_i)$$

where $f_i$ is the frequency four-vector of the wave, built from the frequency ν and the wave vector k, $P(x^i f_i)$ is the phase
of the wave and $x^i$ is the point in spacetime where we study the wave. The amplitude
$A(x^i)$ is a Lorentz tensor depending on the tensorial character of the wave.

The dual description of the physical nature of the particle is achieved by
the introduction of a relation between the frequency four-vector and the four-momentum.
Planck’s relation E = hν defines a correspondence between the zeroth
components of these four-vectors. We define a correspondence between the spatial
components of these four-vectors by demanding that they are parallel. Then the
correspondence of the zeroth components demands $\mathbf{p} = h\mathbf{k}$, where $\mathbf{k}$ is the
wave vector of the wave. These imply the relation (18.91).

Conversely, one obtains Planck’s relation if it is demanded that the 3-momentum p of
the particle is parallel to the wave vector of the wave and specifically that $\mathbf{p} = h\mathbf{k}$.
We note that in Special Relativity the deBroglie wave is defined by one covariant
relation (the $p^i = \frac{h}{c} f^i$) whereas in Newtonian Physics by two (the E = hν and
$\mathbf{p} = h\mathbf{k}$), which express at the level of Dynamics the concepts of absolute space
and absolute time.




Example 18.11.5 When Special Relativity was in its infancy Einstein, in order to
“explain” the equivalence of mass and energy, proposed the following simple thought
experiment.¹⁹

An excited nucleus which rests in the laboratory (L) falls to its ground state by
emitting two photons with the same frequency in opposite directions along the
x-axis. Consider a frame Σ which moves in the standard configuration with speed
u relative to the laboratory. Compute the 3-momentum and the energy of the system of the
two photons in Σ. Assuming that the nucleus stays still after the emission of the
photons show that the nucleus has reduced its mass by $\Delta m = \frac{E}{c^2}$ where E is the
total energy of the two photons in the laboratory.

Solution 

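The frequency four-vectors of the photons in the proper frame of the nucleus
(= laboratory) are respectively:

$$f_1^i = \nu\begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \end{pmatrix}_L, \qquad f_2^i = \nu\begin{pmatrix} 1 \\ -1 \\ 0 \\ 0 \end{pmatrix}_L.$$

In Σ these four-vectors become:

$$f_1^i = \gamma\nu\left(1 - \frac{u}{c}\right)\begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \end{pmatrix}_\Sigma, \qquad f_2^i = \gamma\nu\left(1 + \frac{u}{c}\right)\begin{pmatrix} 1 \\ -1 \\ 0 \\ 0 \end{pmatrix}_\Sigma.$$

Therefore the 3-momenta of the photons in Σ are:

$$\mathbf{p}_1 = \frac{h\nu_1}{c}\,\mathbf{i} = \frac{h}{c}\,\gamma\nu\left(1 - \frac{u}{c}\right)\mathbf{i}$$
$$\mathbf{p}_2 = -\frac{h\nu_2}{c}\,\mathbf{i} = -\frac{h}{c}\,\gamma\nu\left(1 + \frac{u}{c}\right)\mathbf{i}$$

where $\nu_1 = \gamma\nu\left(1 - \frac{u}{c}\right)$ and $\nu_2 = \gamma\nu\left(1 + \frac{u}{c}\right)$ are the frequencies of the two photons in Σ.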


¹⁹See A. Einstein (1950) “Out of My Later Years” (Philosophical Library, N.Y.) and A. Einstein (1979)
‘A Centenary Volume’, edited by A.P. French (Harvard University, Cambridge, MA), 319.




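The total energy of the photons in Σ, where their frequencies are $\nu_1 = \gamma\nu\left(1 - \frac{u}{c}\right)$ and $\nu_2 = \gamma\nu\left(1 + \frac{u}{c}\right)$, is:

$$E_\Sigma = h(\nu_1 + \nu_2) = 2h\gamma\nu = \gamma E$$

where E = 2hν is the total energy of the two photons in the laboratory, and the total momentum:

$$\mathbf{P} = \mathbf{p}_1 + \mathbf{p}_2 = -\frac{2h\gamma\nu}{c}\,\frac{u}{c}\,\mathbf{i} = -\frac{E}{c^2}\,\gamma u\,\mathbf{i}.$$

Assuming that the nucleus stays still in the laboratory after its transition, its velocity in Σ
is the same before and after the emission, so for an unchanged mass conservation of
3-momentum would require $\mathbf{P} = 0$, which makes no sense. In order to explain this paradox and also restore the
conservation of the 3-momentum of the system one assumes that

$$\mathbf{P} = \Delta m\,\gamma u\,\mathbf{i}$$

where Δm is the necessary change in the mass of the nucleus. Comparing with the expression
for $\mathbf{P}$ found above we obtain $-\Delta m = \frac{E}{c^2}$, where −Δm indicates reduction of mass.
This relation indicates the equivalence of mass and energy.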
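The argument can be followed numerically. The Python sketch below (standard library only) boosts the two photon four-vectors from the laboratory to Σ, sums their energies and momenta, and solves the momentum balance of the nucleus (which moves with velocity −u in Σ before and after the emission) for the mass reduction. The units c = h = 1, the sample values, and the function names are illustrative assumptions.

```python
import math

C = 1.0  # speed of light (illustrative units)
H = 1.0  # Planck constant (illustrative units)


def gamma(u):
    """Lorentz factor for a speed u along the x-axis."""
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)


def boost_x(f, u):
    """Components (f0, f1) of a four-vector in a frame moving with velocity u along x."""
    g = gamma(u)
    return (g * (f[0] - u * f[1] / C), g * (f[1] - u * f[0] / C))


def photon_pair_in_moving_frame(nu, u):
    """Total energy and x-momentum in the moving frame of two photons that
    have frequency nu in the laboratory and travel along +x and -x."""
    f1 = boost_x((nu, nu), u)    # photon emitted along +x
    f2 = boost_x((nu, -nu), u)   # photon emitted along -x
    energy = H * (f1[0] + f2[0])
    momentum = H / C * (f1[1] + f2[1])
    return energy, momentum


def mass_reduction(nu, u):
    """Mass M1 - M2 lost by the nucleus, from momentum conservation in the moving frame.

    Before emission the nucleus carries momentum M1*gamma*(-u), afterwards
    M2*gamma*(-u); the difference must equal the photon momentum P.
    """
    _, P = photon_pair_in_moving_frame(nu, u)
    return -P / (gamma(u) * u)


if __name__ == "__main__":
    nu, u = 2.0, 0.5
    print(mass_reduction(nu, u), 2 * H * nu / C ** 2)
```

For any admissible speed u the computed mass reduction equals 2hν/c², that is, the laboratory energy of the photon pair divided by c².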


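Example 18.11.6 An excited nucleus makes a transition to its ground state by
emitting a photon as shown in Fig. 18.8. Let Σ be the proper frame of the nucleus
and let x-y be the plane defined by the 3-vectors $\mathbf{p}_1$, $\mathbf{p}_2$, $\hat{\mathbf{k}}$, where $\hat{\mathbf{k}}$ is the unit vector
along the wave vector of the photon. We define in Σ the quantities:

$$\Phi_i = \gamma_i\left(1 - \frac{\mathbf{v}_i\cdot\hat{\mathbf{k}}}{c}\right), \qquad \Psi_i = \gamma_i\,\frac{|\mathbf{v}_i\times\hat{\mathbf{k}}|}{c}, \qquad i = 1, 2$$

where $\gamma_i = \left(1 - \frac{v_i^2}{c^2}\right)^{-1/2}$, $i = 1, 2$.

Fig. 18.8 Transition of a nucleus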




a. Prove the relations:

$$M_1\Phi_1 = M_2\Phi_2 = a$$
$$M_1\Psi_1 = M_2\Psi_2 = b$$

for all initial and final 3-momenta and energies.
b. Show that the following identity holds:

$$\gamma_i = \frac{1}{2\Phi_i}\left(1 + \Phi_i^2 + \Psi_i^2\right)$$

and from this deduce that the energy of the produced photon in Σ is given by:

$$h\nu = \frac{1}{2a}\left(M_1^2 - M_2^2\right)c^2$$

where $M_1$, $M_2$ are the masses of the nucleus before and after the transition.
c. Let $\nu_0$ be the frequency of the photon which is produced by the transition of the
atom and $E_0$ the average energy of the states of the atom. Show that:

$$\nu = \frac{E_0}{a c^2}\,\nu_0.$$

d. If $\nu_1$ (resp. $\nu_2$) is the frequency of the photon in the proper frame of the nucleus
before (resp. after) the transition, show the relation:

$$\frac{\nu_1}{\nu_2} = \frac{\Phi_1}{\Phi_2}.$$
Solution 


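a. Conservation of four-momentum gives:

$$\begin{pmatrix} E_1/c \\ \mathbf{p}_1 \end{pmatrix}_\Sigma - \begin{pmatrix} E_2/c \\ \mathbf{p}_2 \end{pmatrix}_\Sigma = \frac{h\nu}{c}\begin{pmatrix} 1 \\ \hat{\mathbf{k}} \end{pmatrix}_\Sigma \;\Rightarrow$$

$$E_1 - E_2 = h\nu \qquad (18.102)$$
$$\mathbf{p}_1 - \mathbf{p}_2 = \frac{h\nu}{c}\,\hat{\mathbf{k}}. \qquad (18.103)$$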
Cc 


We decompose the second equation parallel and normal to the direction of k 
and find: 




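$$M_1\gamma_1\,\mathbf{v}_1\cdot\hat{\mathbf{k}} - M_2\gamma_2\,\mathbf{v}_2\cdot\hat{\mathbf{k}} = \frac{h\nu}{c} \qquad (18.104)$$

$$M_1\gamma_1 v_1\sin\theta_1 - M_2\gamma_2 v_2\sin\theta_2 = 0 \qquad (18.105)$$

where $M_1$, $M_2$ are the masses of the nucleus before and after the transition and $\theta_i$ is the angle between $\mathbf{v}_i$ and $\hat{\mathbf{k}}$.
We set:

$$\Psi_i = \gamma_i\,\frac{v_i}{c}\sin\theta_i = \gamma_i\,\frac{|\mathbf{v}_i\times\hat{\mathbf{k}}|}{c}, \qquad \Phi_i = \gamma_i\left(1 - \frac{\mathbf{v}_i\cdot\hat{\mathbf{k}}}{c}\right), \qquad i = 1, 2$$

and equations (18.104) and (18.105), together with (18.102) written as $M_1\gamma_1 c^2 - M_2\gamma_2 c^2 = h\nu$, are finally written:

$$M_1\Psi_1 = M_2\Psi_2 = b$$
$$M_1\Phi_1 = M_2\Phi_2 = a.$$

We note that the quantities a, b are not Lorentz invariant quantities, but they are
simply scalar quantities whose value depends on the RIO in which they are computed.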


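b. We have:

$$\Phi_i^2 + \Psi_i^2 = \gamma_i^2\left(1 + \frac{v_i^2}{c^2} - \frac{2v_i\cos\theta_i}{c}\right) = 2\gamma_i\Phi_i - 1 \;\Rightarrow\; \gamma_i = \frac{1}{2\Phi_i}\left(1 + \Phi_i^2 + \Psi_i^2\right).$$

c. From (18.102) we find:

$$\frac{h\nu}{c^2} = M_1\gamma_1 - M_2\gamma_2 = M_1\,\frac{1 + \Phi_1^2 + \Psi_1^2}{2\Phi_1} - M_2\,\frac{1 + \Phi_2^2 + \Psi_2^2}{2\Phi_2}$$
$$= \frac{1}{2a}\left(M_1^2 + a^2 + b^2\right) - \frac{1}{2a}\left(M_2^2 + a^2 + b^2\right)$$
$$= \frac{1}{2a}\left(M_1^2 - M_2^2\right).$$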




If the nucleus before the transition (resp. after) is at rest in the laboratory, that
is we have a source (resp. an observer) at rest in the laboratory, then $\Phi_1 = 1$ and
$a = M_1$ hence:

$$h\nu_1 = \frac{1}{2M_1}\left(M_1^2 - M_2^2\right)c^2. \qquad (18.106)$$

If the nucleus after the transition is at rest in Σ then $a = M_2$ and we have:

$$h\nu_2 = \frac{1}{2M_2}\left(M_1^2 - M_2^2\right)c^2.$$


We note that in each case the frequency of the photon depends only on the mass 
of the nucleus before and after the transition and not on the initial velocity before 
or the final velocity after the transition. This shows that the emission of a photon 
is independent of the velocity of the emitter or the receiver. 
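Parts a to c can be cross-checked numerically. The Python sketch below (standard library only) evaluates Φ and Ψ for a nucleus moving at an angle θ to the photon direction, verifies the identity γ = (1 + Φ² + Ψ²)/(2Φ), computes the photon energy from hν = (M₁² − M₂²)c²/(2a) with a = M₁Φ₁, and then recovers the final mass M₂ directly from energy and momentum conservation. The units c = 1, the sample values, and the function names are assumptions made for illustration.

```python
import math

C = 1.0  # speed of light (illustrative units)


def phi_psi(v, theta):
    """Return (gamma, Phi, Psi) for speed v at angle theta to the photon direction."""
    g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    phi = g * (1.0 - v * math.cos(theta) / C)
    psi = g * v * math.sin(theta) / C
    return g, phi, psi


def gamma_from_phi_psi(phi, psi):
    """Lorentz factor recovered from Phi and Psi."""
    return (1.0 + phi ** 2 + psi ** 2) / (2.0 * phi)


def photon_energy(M1, M2, v1, theta1):
    """Photon energy h*nu = (M1^2 - M2^2) c^2 / (2 a), with a = M1 * Phi_1."""
    _, phi1, _ = phi_psi(v1, theta1)
    a = M1 * phi1
    return (M1 ** 2 - M2 ** 2) * C ** 2 / (2.0 * a)


def final_mass(M1, v1, theta1, h_nu):
    """Mass of the nucleus after emission, from four-momentum conservation."""
    g1, _, _ = phi_psi(v1, theta1)
    E2 = M1 * g1 * C ** 2 - h_nu
    p_par = M1 * g1 * v1 * math.cos(theta1) - h_nu / C  # along the photon direction
    p_perp = M1 * g1 * v1 * math.sin(theta1)            # normal to the photon direction
    return math.sqrt(E2 ** 2 - (p_par ** 2 + p_perp ** 2) * C ** 2) / C ** 2


if __name__ == "__main__":
    h_nu = photon_energy(5.0, 4.0, 0.6, 0.8)
    print(h_nu, final_mass(5.0, 0.6, 0.8, h_nu))
```

With the emitter at rest (v₁ = 0) the photon energy reduces to (M₁² − M₂²)c²/(2M₁), in agreement with (18.106).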

d. In order to prove the compatibility of the Doppler effect with the conservation
of four-momentum it is enough to show that the frequency $\nu_1$ computed
from (18.106), when it is Doppler transformed from the excited nucleus 1, which
we consider to be the source, to the proper frame of the nucleus 2 after the
transition, which we consider to be the observer, coincides with $\nu_2$, which is given
by relation (18.32). Indeed we have:

$$\frac{\nu_1}{\nu_2} = \frac{M_2}{M_1} = \frac{\Phi_1}{\Phi_2}.$$


Example 18.11.7 Consider a relativistic plane wave which in the RIO Σ has period
T. The phase surfaces of this wave in spacetime are 3-d hyperplanes normal to the
time axis of Σ and at a distance T apart (see Fig. 18.9).

a. Describe the phase surfaces of the wave for a RIO Σ′ which moves wrt Σ in the
standard way with speed u.

Fig. 18.9 Wave surfaces of a relativistic plane wave in Σ and Σ′




b. Show that there are two frequencies for a periodic phenomenon in Σ′, one
associated with the particle nature of a particle whose proper frame coincides
with Σ′ and one with the phase surfaces in Σ′.

c. Compute the phase velocity of these waves assuming that the wave vector k of
the wave associated with the particle (the deBroglie wave) and the 3-momentum p of the particle
are related by the formula $\mathbf{k} = \frac{1}{h}\mathbf{p}$.

d. Show that the group velocity of the deBroglie wave associated with the particle
equals the velocity u of the particle.


Solution 


a. The wave surfaces in Σ are defined by the condition ct = constant. These are
straight lines (we ignore the y, z coordinates) parallel to the x-axis. Under the
action of the boost relating Σ, Σ′ these lines transform to lines parallel to the x′
axis as shown in Fig. 18.9. The angle φ between the two families of planes is the
rapidity of Σ, Σ′ given by cosh φ = γ.

b. Let (OA) = $T_0$ be a proper time interval for Σ′. For Σ this gives the coordinate
time (OT) = T. From the triangle OTA we have:

$$(OT) = (OA)\cosh\phi = \gamma\,(OA) \;\Rightarrow\; T = \gamma T_0$$

which is the time dilation formula (of course this can be proved by the direct use
of the boost equations). The frequency $\nu_0$ of a periodic phenomenon in Σ′ with
period $T_0$ appears in Σ as a periodic phenomenon with frequency:

$$\nu = \frac{1}{T} = \frac{\nu_0}{\gamma}.$$


Consider now the point B on the world line of Σ which is defined by the
intersection of the phase surface at T′ with the world line of Σ. Then:

$$(OB) = (OT) - (BT) = (OA)\cosh\phi - (TA)\tanh\phi = (OA)\cosh\phi - (OA)\sinh\phi\tanh\phi$$
$$= \frac{\cosh^2\phi - \sinh^2\phi}{\cosh\phi}\,(OA) = \frac{(OA)}{\cosh\phi} = \frac{(OA)}{\gamma} = \frac{T_0}{\gamma}.$$

We may define a new frequency $\nu_1 = \frac{1}{(OB)}$ as the number of phase surfaces
per unit of time. Then we have:

$$\nu_1 = \gamma\nu_0.$$




c. Which frequency should one consider? This same question was posed by Louis
deBroglie in his Ph.D. thesis in 1924. The answer is both! The first (which is
related to the time dilation) has to do with the particle nature of the particle and
involves the particle velocity. The second (which is concerned with the phase
surfaces) has to do with the wave nature of the particle and the velocity it involves
is the phase velocity of the deBroglie wave associated with the particle. Therefore
we fix the phase speed of the deBroglie wave so that the two considerations are
compatible.

For a Newtonian wave the wave vector k is related to the 3-momentum p of
the particle by the relation:

$$\mathbf{p} = h\mathbf{k}.$$

But $k = \frac{\nu_1}{w}$ where w is the phase velocity of the wave. Then:

$$\frac{\nu_1}{w} = \frac{1}{h}\,m\gamma u.$$

Replacing the phase frequency $\nu_1 = \gamma\nu_0$, with $h\nu_0 = mc^2$, we find that the phase speed of the
deBroglie wave is associated with the particle speed u in a RIO Σ as follows:

$$w = \frac{c}{\beta} = \frac{c^2}{u}.$$
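d. The group velocity U of the deBroglie wave is given by:

$$U = \frac{d\nu_1}{dk} = \frac{d\nu_1/d\beta}{dk/d\beta} = \frac{\nu_0\,\beta\gamma^3}{\frac{\nu_0}{c}\,\gamma\,(\beta^2\gamma^2 + 1)} = \frac{c\beta\gamma^2}{\gamma^2} = u$$

that is, the particle velocity in Σ.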
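The phase and group velocities found in parts c and d can be verified numerically. The Python sketch below (standard library only) takes the phase frequency ν₁ = γmc²/h and the wave number k = γmu/h, forms the phase velocity ν₁/k, and estimates the group velocity dν₁/dk with a central difference in u. The units c = h = 1, the step size, and the function names are illustrative assumptions.

```python
import math

C = 1.0  # speed of light (illustrative units)
H = 1.0  # Planck constant (illustrative units)


def de_broglie(m, u):
    """Phase frequency and wave number of the deBroglie wave of a particle
    of mass m moving with speed u."""
    g = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    nu = g * m * C ** 2 / H  # nu_1 = gamma * nu_0 with h * nu_0 = m c^2
    k = g * m * u / H        # k = p / h
    return nu, k


def phase_velocity(m, u):
    """w = nu / k."""
    nu, k = de_broglie(m, u)
    return nu / k


def group_velocity(m, u, du=1e-6):
    """U = d(nu)/d(k), estimated by a central difference in the particle speed."""
    nu_plus, k_plus = de_broglie(m, u + du)
    nu_minus, k_minus = de_broglie(m, u - du)
    return (nu_plus - nu_minus) / (k_plus - k_minus)


if __name__ == "__main__":
    m, u = 1.0, 0.6
    print(phase_velocity(m, u), group_velocity(m, u))
```

The phase velocity comes out as c²/u, which exceeds c, while the group velocity reproduces the particle speed u, so that their product equals c².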


Chapter 19
The Physical System Continuum


In order to understand properly the relativistic fluids it is best to study them in 
comparison with the Newtonian fluids, provided the latter are approached from a 
relativistic point of view. In the following we restrict our discussion to Newtonian 
Physics and Special Relativity, however in a way which generalizes naturally to 
General Relativity where this topic is of fundamental importance. For the sake of
clarity and completeness we might need to repeat some basic concepts of previous 
chapters, however in a slightly different spirit. 


19.1 The Manifold Structure 


We consider an aggregate of particles which occupy a connected region in space. 
These particles constitute a physical system which we call a deformable body. As 
it is the case with all theories of Physics, in order to study a deformable body as a 
physical system we have to introduce physical quantities, which characterize one or 
more physical aspects of the system. In order to do that one needs to identify each 
physical quantity with a geometric object. Each theory of Physics uses different 
types of geometric objects whose nature is defined by the covariance principle of 
the theory. In Newtonian Theory this Principle requires that the physical quantities
shall be expressed in terms of Euclidian tensors and that whatever laws are
considered in that theory shall relate only Euclidian tensors. Similarly in SR the
Covariance Principle requires that the physical quantities of a continuum shall be
described by Lorentz tensors and the laws which will be proposed in that theory
must contain this type of tensors only.¹ Therefore in GR the covariance principle


¹In General Relativity there does not exist a finite dimensional group acting globally on all
the spacetime manifold, therefore the group of transformations consists of all differentiable


© Springer Nature Switzerland AG 2019
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-27347-7_19 




requires that the physical quantities of a continuum will be described by general 
tensors and the proposed laws shall relate general tensors only. 

All theories of Physics study the evolution (or in general the change of the 
state) of the systems in a background space whose points are assumed to contain 
information concerning the state of the system. Specifically for the theories which 
study the motion of physical systems the background space consists of points 
describing position in 3D space and a parameter, the “time”, which is enumerating 
the various states of a physical system. 

In a bare set of points there is not much one can do in order to study the motion
of a physical system and we have to introduce additional mathematical structures 
which shall make possible the description of the state of the physical systems in 
terms of numbers. Without getting too far into the subject we assume that the 
background space has the structure of a differentiable manifold. Indeed the key 
concept of manifold structure is the coordinate system which allows the description 
of the points of the background space in terms of coordinates (numbers) and this 
is done in a way that if a point lies in two coordinate systems then there exists a 
differentiable transformation which relates the two sets of coordinates. 

The manifolds we shall consider are real which means that locally they will be
parts of the linear space $R^n$ where n is the dimension of the manifold; the latter
being determined by the (minimum!) number of coordinates required by the physical
theory in order to describe the state of the physical system.

In the following we shall consider the Newtonian Physics and the Theory of 
Special Relativity. In both theories the manifold is the linear space $R^4$ in which the
coordinates used are (as a rule) one coordinate for the “time” and three coordinates
for the “space”. Furthermore it is assumed that in both these theories the manifold
they use can be covered by a single chart only, that is, a coordinate system covers
all space.²

To appreciate the importance of the manifold structure consider a physical 
system which moves and two observers observing the motion of the system. As 
we have already explained in Chap. 2 the description of the motion of the physical 
system by an observer within a theory of Physics is done by means of certain 
apparatus and certain procedures of using that specific apparatus. The result of the 
observation is the coordinates of the state of the system by the observer of the 
theory. The coordinates of a certain state of a physical system are different for 
different observers. However the existence of coordinate transformations allows 
the “communication” of the observers (within the specified theory of Physics) 
so that each observer may transform his/her observed coordinates to those of any other
observer using the appropriate coordinate transformations of the fundamental (i.e.
the covariance) group of the theory. In that sense the state of the system is observer 


transformations. This group is infinite dimensional and it is called the manifold mapping group 
(MMG for short) see J L Anderson “Principles of Relativity Physics” Academic Press 1967. 

²In technical terms this is equivalent to saying that the manifold structure of both theories has zero
curvature or it is flat. This is not the case with General Relativity.




independent within a physical theory. If this exchange of “information” is not 
compatible with the coordinate transformations of a theory then the evolution of the 
system cannot be described within that theory.³ This is the reason that we need many
theories of Physics in order to describe the various physical systems/phenomena. 


19.2 The Geometry Defined by the Metric 


Even the manifold structure is not rich enough for the study of the evolution
of physical systems. It can be used only for qualitative general studies involving 
fundamental aspects of a physical theory. However in Physics we need to quantify 
the evolution of physical quantities and for that to be possible it is necessary to 
consider additional geometric structures defined on the manifold structure. These 
additional structures are defined by means of various geometric objects. When this 
is done we say that the manifold is a space or it has a geometry. 

The first such geometric object — and common to all theories of Physics proposed 
so far — has been introduced for the first time on a scientific basis by the ancient 
Greeks and it is the metric. Euclid introduced the Euclidian metric and defined 
the Euclidian Geometry which was the key science for many centuries. It was the 
geometry of the world as we perceive it with our senses, therefore its fundamental 
character could not be disputed. Due to this, this Geometry was used by Newton in 
the construction of Newtonian Physics. A few centuries after Newton a different 
metric was introduced which, however, was motivated by the Physics of the 
“cosmos” of the particles as this is perceived by machines and not directly by the 
human senses (see also Chap. 2). This is the Lorentz metric, which was introduced 
by H. Minkowski and defined the Minkowski (or hyperbolic) Geometry on a four 
dimensional manifold. This metric is used by the Theory of Special
Relativity which had been developed earlier on physical grounds by Albert Einstein.

The consideration of the Lorentz metric was a major step because it disassociated
Physics from direct sensory observation. The acceptance of the Lorentz metric was
perhaps facilitated by the fact that a few decades before Minkowski, Riemann had
generalized the concept of metric as a general two index tensor on an n-dimensional
manifold and developed the Riemannian Geometry.⁴ It was inevitable then, after
the inability of Special Relativity to describe the gravitational phenomena, that


³This does not mean that a theory which cannot “explain” the physical phenomenon is wrong.
Simply the case is that this physical phenomenon does not belong to the set of physical phenomena 
that the certain physical theory can explain. For example the constancy of the speed of light is 
not a Newtonian phenomenon. This does not mean that Newtonian theory is wrong! Simply that 
phenomenon belongs to the set of phenomena of another theory and this is the reason that we had 
to develop the Theory of Special Relativity. 

⁴The new concept introduced by this geometry was that of the covariant derivative, the latter being
defined in terms of the metric (Riemannian connection). Today the Riemannian geometry has been 
generalized to include new geometric objects such as torsion and metricity. 




Einstein used the Riemannian Geometry in order to develop the Theory of General 
Relativity in 1916. 

The above makes apparent the parallel development and the close relation/
evolution of Geometry and the theories of Physics. Although this is a fascinating
subject we do not intend to discuss it further because it is outside the scope of
this book. The background space $R^4$ enriched with the manifold structure and the
geometry defined by the metric we call spacetime.

It is important to note that the spacetime of Newtonian Physics (which we shall
denote with $E^4$) and that of the Special Theory of Relativity (which we shall denote with
$M^4$) both can be covered by one chart (i.e. one coordinate system). Technically this
means that both the Euclidian metric and the Lorentz metric obtain their canonical
form⁵ diag(1, 1, 1, 1) and diag(−1, 1, 1, 1) globally, that is in all spacetime.


19.3 The Structure of the Classical Theories of Physics


There are two levels at which a physical theory operates in spacetime. The first 
concerns the study of the world lines per se of physical systems in spacetime and 
it is called the kinematics of the theory. The second concerns the interaction of the 
world lines with the environment and defines the dynamics of the physical theory. 
Kinematics is more fundamental in the sense that it is possible to have theories with 
different dynamics but the same kinematics. For example one may propose (and 
that has been done) a different gravitational law in Newtonian Physics using the 
same kinematics of the standard Newtonian theory. In the following we consider 
the kinematics and the dynamics of Newtonian theory and Special Relativity with a 
double purpose. First to lay down properly the ground for the physics of deformable 
bodies and second in order to prepare the ground for the passage to General 
Relativity, where one deals as a rule with sets of world lines (fluids). 


19.4 Geometry and the Covariance Principle 


There is no need to restrict our considerations to the flat spaces $E^3$ and $M^4$, therefore
we shall continue the discussion of this section assuming Riemannian geometry.

The basic tool introduced by the metric is the distance between two points,
defined as follows. Consider two points 1 and 2 in a Riemannian space and connect
them with a smooth curve $x^i(s)$ where s is a parameter along the curve. Along the
curve one defines the tangent vector field $u^i = \frac{dx^i}{ds}$ whose length is given by the


⁵The metric has its canonical form if all entries of the matrix representing the metric are ±1. In
Riemannian Geometry and consequently in General Relativity the canonical form of the metric is 
possible only at one point. 




expression $|u| = \sqrt{|g_{ij}u^i u^j|}$. Depending on the character of the metric the length
$|u|$ can be positive or zero. A transformation which preserves the length $|u|$ is called
an isometry. The linear isometries form a finite dimensional Lie group.

Let us assume that $|u| \neq 0$. Then it is possible to consider the class of curves
which connect the points 1, 2 for which the integral $\int_1^2 \sqrt{|g_{ij}u^i u^j|}\,ds$ is an extremum
(maximum or minimum depending on the character of the metric). These curves we
call geodesics.
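The dependence of the extremum on the character of the metric can be seen in a small numerical experiment. The Python sketch below (standard library only) approximates the length integral by a sum over short chords, for a straight line and for a bent curve joining the same two points, once with the Euclidean metric diag(1, 1) and once with the Lorentz metric diag(−1, 1). The restriction to two dimensions, the particular bent curve, and all names are assumptions made for illustration.

```python
import math

EUCLIDEAN = (1.0, 1.0)   # diag(1, 1)
LORENTZ = (-1.0, 1.0)    # diag(-1, 1); the first coordinate plays the role of ct


def curve_length(curve, metric_diag, n=2000):
    """Approximate the integral of sqrt(|g_ij u^i u^j|) ds along curve(s), 0 <= s <= 1,
    for a diagonal metric, by summing the lengths of n short chords."""
    total = 0.0
    previous = curve(0.0)
    for step in range(1, n + 1):
        point = curve(step / n)
        q = sum(g * (b - a) ** 2 for g, a, b in zip(metric_diag, previous, point))
        total += math.sqrt(abs(q))
        previous = point
    return total


def straight(s):
    """Straight line from (0, 0) to (1, 0)."""
    return (s, 0.0)


def bent(s):
    """A bent curve with the same endpoints; its slope stays below 1,
    so it is timelike with respect to the Lorentz metric."""
    return (s, 0.3 * math.sin(math.pi * s))


if __name__ == "__main__":
    for name, metric in (("Euclidean", EUCLIDEAN), ("Lorentz", LORENTZ)):
        print(name, curve_length(straight, metric), curve_length(bent, metric))
```

With the Euclidean metric the straight line is the shortest of the two curves, whereas with the Lorentz metric the straight timelike line is the longest one, which illustrates the "maximum or minimum" remark above.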

The geodesics are used in all classical theories of Physics in order to associate 
Geometry with the kinematic quantities of the theory. They have a double role: 


a. To define a class of special observers (the inertial observers) and 
b. To define a characteristic group of transformations which preserves the geodesics 
and it is used to define the type of tensors of the theory. 


In a flat space the geodesics are called straight lines and are expressed by
linear functions of $u^i$. The linear isometries which preserve the geodesics (and the
length $|u|$) form a group which is identified as the covariance group of the physical
theory. This group is used by the Covariance Principle in order to define the type
of geometric objects of the theory. Each theory of Physics has its own physical
quantities which are described by means of the geometric objects of the theory and
constitute the set of the real phenomena the theory can “describe”. There is no
physical theory which explains all physical phenomena. In addition to the definition
of the covariance group of the theory the geodesics are used to relate the geometry
of spacetime with the Physics by identifying the geodesics as the trajectories of a
special class of observers of the theory called the inertial observers.⁶ These two
roles define the kinematics of the theory.

In Newtonian Physics the straight lines are identified with the trajectories of 
the Newtonian Inertial Observers (NIO see Chap. 3) and the linear transformations
which preserve the straight lines of three dimensional Euclidian geometry generate 
the Galilean group of transformations. The latter defines the Euclidian tensors which 
are used in the description of the Newtonian physical quantities and the statement 
of the Newtonian physical laws. 

In the case of Minkowski space the timelike straight lines are identified with 
the world lines of the Relativistic Inertial Observers (RIO see Sect. 4.5.4) and the 
group of linear isometries which transform one timelike straight line to another is 
the Lorentz group. The straight lines of zero length are identified with the world 
lines of photons (zero mass particles in general) and the ones with positive length 
are identified with the field lines of the various fields (electric field, magnetic field 
etc.). Every timelike curve in Minkowski spacetime other than a straight line is identified
with the world line of an accelerated relativistic observer. The study of these two
congruences of lines constitutes the kinematics of SR.


⁶In General Relativity they are identified with the observers in free fall in the gravitational field.


19.5 Kinematics: The Connecting Vector 


Kinematics concerns the study of the world lines (“trajectories”) of observers in 
spacetime. Equivalently one may say that the “trajectories” are the objects of 
kinematics. 

Mathematically the trajectory of a particle in spacetime is a parameterized curve
where the parameter is the “proper” time of the particle (photons excluded). A
parameterized curve can be described by its first and second derivative, therefore
the Kinematics of a continuum involves the velocity and its first derivative along
the curve. On the other hand the description of the “relative motion” of a set of
parameterized curves is done by means of the derivative of $u^\mu$. It is therefore postulated
that the kinematic description of a deformable body involves two differentiable
vector fields in the region occupied by the continuum:


a. The velocity vector field $u^\mu$
b. The derivative vector field of $u^\mu$.


The Kinematics of a deformable body is precisely the geometric study of these 
two vector fields. 

The study of a vector field is done in various ways. The one closer to geometry 
is by means of the study of its flow. The flow of a (differentiable) vector field is the 
congruence of all integral curves of the vector field. These curves are parameterized 
with the arc length s (it can be any other parameter the scenario stays the same) 
and every member of the congruence can be characterized with n quantities, where 
n is the dimension of the space where the orbits are traced. To understand the role 
of n we consider a point P in space and the integral curve (the orbit) through P. 
Given the first and the second derivative of the curve at P one is able to compute 
(i.e. trace) the curve near P. Therefore one can characterize the curve (orbit) by 
the coordinates of P and the (affine) parameter s along the curve. If in a coordinate
system the coordinates of P are $y^a$, $a = 1, 2, \ldots, n$, then the set $(s, y^a)$ characterizes
the curve of the congruence through P.

However a single curve does not give information about the continuum and one 
has to consider the comparative study of two curves. This study is realized by means 
of relative motion as follows. 

Consider the curve through the point P and a second integral curve through a
point Q near⁷ the point P. Choose the parameter along the second curve so that at
the points P, Q the value of the parameter along each curve is the same and consider
the vector $PQ^i$ joining the points P, Q. The vector $PQ^i$ we call the connecting
vector at P. Then define a vector field along the integral curve through P by joining
points along these integral curves always for the same value of the parameter. This
procedure defines a “transportation” law of the connecting vector along the integral

⁷The word near has to be defined. However for the case of Newtonian Physics and the Theory of
Special Relativity where the underlying geometry is flat the usual concept of vector is enough. For
a general Riemannian space this is not so and one has to define it accordingly.




curve through P, which we call Lie transportation along the tangent vector of 
that curve. Because P is arbitrary this vector field defines a new vector field within 
the continuum, which we call the connecting vector field. The study of this field 
together with the velocity field and the acceleration field constitute the Kinematics 
of deformable bodies in a physical theory. 

In the following we study the Kinematics of deformable bodies first in Newtonian 
Physics and then in Special Relativity. Before we do that for the convenience of the 
reader we briefly review some relevant properties of Euclidian vectors in $R^3$.


19.6 A Brief Detour in Vector Analysis 


In the current section we discuss various identities. We start using the language 
of the standard vector analysis and then we continue with the tensor formulation, 
which greatly simplifies the results. The definitions which follow are not restricted 
to a three dimensional Euclidian vector space and apply to an n—dimensional (real) 
vector space. With minor changes they also apply to Special Relativity. 


19.6.1 Vector Formulation 


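Consider a general vector function $\mathbf{f}(x, y, z)$ defined over a connected region of $R^3$.
We define the directional derivative $(\mathbf{u}\cdot\nabla)\mathbf{f}$ of $\mathbf{f}(x, y, z)$ along an arbitrary vector
$\mathbf{u}$ by the relation:

$$(\mathbf{u}\cdot\nabla)\mathbf{f} = \left[u_x\frac{\partial}{\partial x} + u_y\frac{\partial}{\partial y} + u_z\frac{\partial}{\partial z}\right]\mathbf{f}. \qquad (19.1)$$

For $\mathbf{f} = \mathbf{r}$ and all vectors $\mathbf{u}$ we have the identity:

$$(\mathbf{u}\cdot\nabla)\mathbf{r} = \mathbf{u}. \qquad (19.2)$$

For later use we note the obvious result:

$$\nabla\times\mathbf{r} = 0. \qquad (19.3)$$

Next we discuss the decomposition of an arbitrary vector $\mathbf{u}$ relative to the position
vector $\mathbf{r}$. As it will be shown this decomposition is useful in the study of relative
motion.⁸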


⁸This decomposition can be done straightforwardly using the 1 + 2 decomposition we developed
in Sect. 12.2.2. However for the convenience of the unfamiliar reader we do it in this section using 
standard vector formalism. 




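We start with the well known identity of vector calculus:

$$\nabla(\mathbf{u}\cdot\mathbf{v}) = \mathbf{u}\times(\nabla\times\mathbf{v}) + \mathbf{v}\times(\nabla\times\mathbf{u}) + (\mathbf{u}\cdot\nabla)\mathbf{v} + (\mathbf{v}\cdot\nabla)\mathbf{u} \tag{19.4}$$

and take $\mathbf{v} = \mathbf{r}$ to find:

$$\nabla(\mathbf{u}\cdot\mathbf{r}) = \mathbf{u}\times(\nabla\times\mathbf{r}) + \mathbf{r}\times(\nabla\times\mathbf{u}) + (\mathbf{u}\cdot\nabla)\mathbf{r} + (\mathbf{r}\cdot\nabla)\mathbf{u}. \tag{19.5}$$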


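The first term vanishes by (19.3) and the term $(\mathbf{u}\cdot\nabla)\mathbf{r} = \mathbf{u}$ by (19.2). Therefore (19.5) gives:

$$\mathbf{u} = \nabla(\mathbf{u}\cdot\mathbf{r}) + (\nabla\times\mathbf{u})\times\mathbf{r} - (\mathbf{r}\cdot\nabla)\mathbf{u}. \tag{19.6}$$

This relation decomposes an arbitrary vector $\mathbf{u}$ into parts relative to the position
vector $\mathbf{r}$.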
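
The decomposition (19.6) can be checked numerically. The following is a minimal sketch in Python, assuming an arbitrarily chosen polynomial test field `u` and central finite differences for the derivatives; the field, the evaluation point and all helper names are illustrative assumptions and do not come from the text.

```python
def u(p):
    """Hypothetical smooth test field u(x, y, z); any differentiable field would do."""
    x, y, z = p
    return (y * z, x * x, x * y + z)


def partial(f, p, i, h=1e-4):
    """Central difference of f (scalar or 3-tuple valued) with respect to coordinate i."""
    hi, lo = list(p), list(p)
    hi[i] += h
    lo[i] -= h
    fh, fl = f(hi), f(lo)
    if isinstance(fh, tuple):
        return tuple((a - b) / (2 * h) for a, b in zip(fh, fl))
    return (fh - fl) / (2 * h)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def jacobian(F, p):
    """Row d[i] holds the derivatives of all components of F with respect to x_i."""
    return [partial(F, p, i) for i in range(3)]


def curl(F, p):
    d = jacobian(F, p)
    return (d[1][2] - d[2][1], d[2][0] - d[0][2], d[0][1] - d[1][0])


def directional(a, F, p):
    """(a . grad) F evaluated at p."""
    d = jacobian(F, p)
    return tuple(sum(a[i] * d[i][k] for i in range(3)) for k in range(3))


def rhs_19_6(p):
    """grad(u . r) + (curl u) x r - (r . grad) u, with r the position vector p."""
    grad_ur = tuple(partial(lambda q: dot(u(q), q), p, i) for i in range(3))
    rot_part = cross(curl(u, p), p)
    dir_part = directional(p, u, p)
    return tuple(g + r - d for g, r, d in zip(grad_ur, rot_part, dir_part))


point = (0.3, -1.2, 0.7)
lhs = u(point)
rhs = rhs_19_6(point)
max_err = max(abs(a - b) for a, b in zip(lhs, rhs))
print(lhs, rhs, max_err)
```

The two sides agree up to the truncation error of the finite differences, which is the most such a numerical check can show; it does not replace the derivation above.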
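We find a new identity if we take $\mathbf{u} = \mathbf{v}$ in (19.4):

$$\nabla\mathbf{v}^2 = 2\mathbf{v}\times(\nabla\times\mathbf{v}) + 2(\mathbf{v}\cdot\nabla)\mathbf{v} \tag{19.7}$$

or, solving for the directional derivative $(\mathbf{v}\cdot\nabla)\mathbf{v}$:

$$(\mathbf{v}\cdot\nabla)\mathbf{v} = \frac{1}{2}\nabla\mathbf{v}^2 - \mathbf{v}\times(\nabla\times\mathbf{v}). \tag{19.8}$$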


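Consider a general vector function $\mathbf{f}(x^i)$, $i = 1,\ldots,n$, defined over a connected
region of $\mathbb{R}^n$. Then the differential $d\mathbf{f}$ of $\mathbf{f}$ is defined as follows:

$$d\mathbf{f} = \frac{\partial\mathbf{f}}{\partial x^1}dx^1 + \frac{\partial\mathbf{f}}{\partial x^2}dx^2 + \ldots = \left(\sum_{i=1}^{n} dx^i\frac{\partial}{\partial x^i}\right)\mathbf{f} = (d\mathbf{r}\cdot\nabla)\mathbf{f} \tag{19.9}$$

that is, the differential of $\mathbf{f}(x^i)$ coincides with the directional derivative of
$\mathbf{f}(x^i)$ along the vector $d\mathbf{r} = (dx^1, dx^2, \ldots, dx^n)$.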


19.6.2. Tensor Formulation 


We write the above formulae using tensor (i.e. index) notation. We use Greek indices
and assume that they take the values 1, 2, 3.
The exterior product $\mathbf{u}\times\mathbf{v}$ in tensor formalism is written as⁹:

$$(\mathbf{u}\times\mathbf{v})^\mu = \varepsilon^{\mu\nu\rho}u_\nu v_\rho \tag{19.10}$$


⁹ Einstein convention applies to all tensor notation.




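where $\varepsilon^{\mu\nu\rho}$ is the well known completely antisymmetric tensor (or Levi-Civita
tensor).¹⁰ Then (19.3) follows trivially¹¹ from $\varepsilon^{\mu\nu\rho}x_{\nu,\rho} = 0$.
The inner product $\mathbf{u}\cdot\mathbf{v}$ is written as:

$$\mathbf{u}\cdot\mathbf{v} = u^\mu v_\mu. \tag{19.11}$$

The directional derivative is the vector $(u^\mu\partial_\mu)f^\nu$.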
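Concerning the identity (19.4) we have:

$$(\delta^{\mu\nu}u_\mu v_\nu)_{,\kappa} = \delta^{\mu\nu}(u_{\mu,\kappa}v_\nu + u_\mu v_{\nu,\kappa})$$

$$= \delta^{\mu\nu}(\delta^\rho_\mu\delta^\sigma_\kappa - \delta^\sigma_\mu\delta^\rho_\kappa)u_{\rho,\sigma}v_\nu + \delta^{\mu\nu}(\delta^\rho_\nu\delta^\sigma_\kappa - \delta^\sigma_\nu\delta^\rho_\kappa)v_{\rho,\sigma}u_\mu + v^\mu u_{\kappa,\mu} + u^\mu v_{\kappa,\mu}.$$

The term:

$$(\delta^\rho_\mu\delta^\sigma_\kappa - \delta^\sigma_\mu\delta^\rho_\kappa)u_{\rho,\sigma}v^\mu = \varepsilon_{\mu\kappa\tau}\varepsilon^{\rho\sigma\tau}u_{\rho,\sigma}v^\mu = -\varepsilon_{\kappa\mu\tau}\varepsilon^{\tau\rho\sigma}u_{\rho,\sigma}v^\mu$$

therefore:

$$(u^\mu v_\mu)_{,\kappa} = -\varepsilon_{\kappa\mu\tau}\varepsilon^{\tau\rho\sigma}u_{\rho,\sigma}v^\mu - \varepsilon_{\kappa\mu\tau}\varepsilon^{\tau\rho\sigma}v_{\rho,\sigma}u^\mu + v^\mu u_{\kappa,\mu} + u^\mu v_{\kappa,\mu}. \tag{19.12}$$

To show that this expression is indeed identity (19.4) we note that in standard
vector notation the terms:

$$-\varepsilon_{\kappa\mu\tau}\varepsilon^{\tau\rho\sigma}u_{\rho,\sigma}v^\mu = \varepsilon_{\kappa\mu\tau}(\nabla\times\mathbf{u})^\tau v^\mu = [\mathbf{v}\times(\nabla\times\mathbf{u})]_\kappa$$

$$v^\mu u_{\kappa,\mu} = [(\mathbf{v}\cdot\nabla)\mathbf{u}]_\kappa$$

from which identity (19.4) follows. Note that for $v^\mu = u^\mu$ this reduces to identity (19.7),
which in tensor notation reads:

$$(u^\mu u_\mu)_{,\kappa} = -2\varepsilon_{\kappa\mu\tau}\varepsilon^{\tau\rho\sigma}u_{\rho,\sigma}u^\mu + 2u^\mu u_{\kappa,\mu}. \tag{19.13}$$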
Finally for the differential of functions we have:

$$df^\mu = f^\mu{}_{,\alpha}\,dx^\alpha. \tag{19.14}$$


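¹⁰ We recall (see Sect. 13.10.1) the identities (Greek indices take the values 1, 2, 3):

$$\varepsilon^{\mu\nu\rho}\varepsilon_{\mu\sigma\tau} = \delta^\nu_\sigma\delta^\rho_\tau - \delta^\nu_\tau\delta^\rho_\sigma,$$

$$\varepsilon^{\mu\nu\rho}\varepsilon_{\mu\nu\tau} = 2\delta^\rho_\tau,$$

$$\varepsilon^{\mu\nu\rho}\varepsilon_{\mu\nu\rho} = 3! = 6,$$

$$\varepsilon_{\mu\nu\rho}A^\mu B^\nu C^\rho = \det(A, B, C).$$


¹¹ Because $x_{\nu,\rho}$ is symmetric in $\nu$, $\rho$.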
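
The contraction identities of the Levi-Civita tensor quoted in the footnote can be verified by brute force over all index values. The sketch below is only an illustration: the closed-form expression used for `eps`, the zero-based indices and the sample vectors `A`, `B`, `C` are assumptions made for the example.

```python
from itertools import product


def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2 (Cartesian, so index position is immaterial)."""
    return (i - j) * (j - k) * (k - i) // 2


def delta(i, j):
    return 1 if i == j else 0


idx = range(3)

# eps^{mu nu rho} eps_{mu sigma tau} = delta^nu_sigma delta^rho_tau - delta^nu_tau delta^rho_sigma
one_contraction_ok = all(
    sum(eps(m, n, r) * eps(m, s, t) for m in idx)
    == delta(n, s) * delta(r, t) - delta(n, t) * delta(r, s)
    for n, r, s, t in product(idx, repeat=4)
)

# eps^{mu nu rho} eps_{mu nu tau} = 2 delta^rho_tau
two_contractions_ok = all(
    sum(eps(m, n, r) * eps(m, n, t) for m, n in product(idx, repeat=2)) == 2 * delta(r, t)
    for r, t in product(idx, repeat=2)
)

# eps^{mu nu rho} eps_{mu nu rho}
full_contraction = sum(eps(m, n, r) ** 2 for m, n, r in product(idx, repeat=3))


def triple(A, B, C):
    """eps_{mu nu rho} A^mu B^nu C^rho."""
    return sum(eps(m, n, r) * A[m] * B[n] * C[r] for m, n, r in product(idx, repeat=3))


def det3(A, B, C):
    """Determinant of the matrix with rows A, B, C by cofactor expansion."""
    return (A[0] * (B[1] * C[2] - B[2] * C[1])
            - A[1] * (B[0] * C[2] - B[2] * C[0])
            + A[2] * (B[0] * C[1] - B[1] * C[0]))


A, B, C = (1, 2, 3), (0, 1, 4), (5, 6, 0)
print(one_contraction_ok, two_contractions_ok, full_contraction, triple(A, B, C), det3(A, B, C))
```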


Chapter 20 
The Physics of Newtonian Deformable
Bodies: Newtonian Fluids 


20.1 The Kinematics of a Newtonian Deformable Body 


We could start this section by considering directly the connecting vector and derive 
all the results. However the average — certainly the new — reader is not familiar with 
this concept and would possibly think that all is a matter of mathematical exercise. 
For this reason we shall start with the traditional velocity picture which is familiar 
to all and then we shall rederive the results using the connecting vector. Perhaps we 
should emphasize that in Special and especially in General Relativity the kinematics 
can be discussed only in the connecting vector approach. 


20.1.1 The Relative Motion in Vector Formulation 


Consider a region $R$ of the linear space $\mathbb{R}^3$ occupied by a deformable body. Choose
the coordinate system $\Sigma$ in $R$ with origin at the point $P$. Let $Q$ be another point
of the continuum near $P$, where by near¹ we understand that the position vector
$\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$ of $Q$ in $\Sigma$ is such that the squares $x^2, y^2, z^2$ of the coordinates are
ignorable. Let $\mathbf{v}$ be the velocity vector field of the deformable body and suppose
that the values of the field at the points $P$, $Q$ of the deformable body at the moment²
$t$ are $\mathbf{v}_P$ and $\mathbf{v}_Q$ respectively. Because the vector field is a differentiable vector field
we have:

$$\mathbf{v}_Q = \mathbf{v}_P + d\mathbf{v}_P \tag{20.1}$$


¹ The assumption of nearness is necessary because otherwise there is no linear connection
between the velocities of the points $P$, $Q$. Here we have the first requirement for the definition
of nearness.

² Note that we work in Newtonian Physics, therefore time is absolute, i.e. the same for all particles.
This is not the case in relativistic theories.


© Springer Nature Switzerland AG 2019 713 
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-27347-7_20 




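where the quantity $d\mathbf{v}_P$ is the differential of the velocity field at the point $P$. The
quantity $d\mathbf{v}_P$ we call the relative velocity of $Q$ relative to $P$. According to (19.9)
the differential $d\mathbf{v}_P$ of the velocity function $\mathbf{v}_P$ is given by the relation:

$$d\mathbf{v}_P = (\mathbf{r}\cdot\nabla)\mathbf{v}_P \tag{20.2}$$

where we have replaced $d\mathbf{r}$ with $\mathbf{r}$ due to the nearness hypothesis³ we have made.
We decompose $d\mathbf{v}_P$ along $\mathbf{r}$ using the general identity (19.6) and write:

$$d\mathbf{v}_P = \nabla(d\mathbf{v}_P\cdot\mathbf{r}) + (\nabla\times d\mathbf{v}_P)\times\mathbf{r} - (\mathbf{r}\cdot\nabla)d\mathbf{v}_P. \tag{20.3}$$

The term⁴:

$$(\mathbf{r}\cdot\nabla)d\mathbf{v}_P = (\mathbf{r}\cdot\nabla)(\mathbf{r}\cdot\nabla)\mathbf{v}_P = (\mathbf{r}\cdot\nabla)\mathbf{v}_P = d\mathbf{v}_P \tag{20.4}$$

and equation (20.3) becomes:

$$d\mathbf{v}_P = \frac{1}{2}\nabla(d\mathbf{v}_P\cdot\mathbf{r}) + \frac{1}{2}(\nabla\times d\mathbf{v}_P)\times\mathbf{r}. \tag{20.5}$$

Replacing in (20.1) we find

$$\mathbf{v}_Q = \mathbf{v}_P + \frac{1}{2}(\nabla\times d\mathbf{v}_P)\times\mathbf{r} + \frac{1}{2}\nabla(\mathbf{r}\cdot d\mathbf{v}_P). \tag{20.6}$$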


This is an equation where all terms are defined at the point P except the “free 
variables” r,dvp which are used to “measure” the relative velocity and the 
“nearness” of the points around P. This equation is in vector form, that is, the same 
for all Newtonian coordinate systems. 

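If we introduce the vorticity vector $\boldsymbol{\omega}$ by the relation:

$$\boldsymbol{\omega} = \frac{1}{2}(\nabla\times d\mathbf{v}_P) \tag{20.7}$$

the velocity of the point $Q$ is written:

$$\mathbf{v}_Q = \mathbf{v}_P + \boldsymbol{\omega}\times\mathbf{r} + \frac{1}{2}\nabla(\mathbf{r}\cdot d\mathbf{v}_P). \tag{20.8}$$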


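³ The replacement of $d\mathbf{r}$ with $\mathbf{r}$ means that we linearize the problem. That is, relation (20.2)
defines the $d\mathbf{v}_P$! The continua which satisfy this requirement are named Linear Continua. Not
all continua are expected to be linear! In general we should write

$$d\mathbf{v}_P = (\mathbf{r}\cdot\nabla)\mathbf{v}_P + O(r^2).$$


⁴ It is easy to prove that, to first order in the coordinates, $(\mathbf{r}\cdot\nabla)(\mathbf{r}\cdot\nabla) = (\mathbf{r}\cdot\nabla)$ by writing $(\mathbf{r}\cdot\nabla) = x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y} + z\frac{\partial}{\partial z}$. A shorter
proof is the following. We write $x^\mu\partial_\mu$ for the term $\mathbf{r}\cdot\nabla$ and have:

$$(x^\mu\partial_\mu)(x^\nu\partial_\nu)v^\rho = (x^\mu\delta^\nu_\mu\partial_\nu)v^\rho + x^\mu x^\nu\partial_\mu\partial_\nu v^\rho = (x^\nu\partial_\nu)v^\rho + O(x^\mu x^\nu).$$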




20.1.2 The Relative Velocity of Linear Continua 
in Tensor Notation 
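The velocity field at $Q$ is:

$$v_Q^\mu = v_P^\mu + dv_P^\mu. \tag{20.9}$$

Assuming that the continuum is a linear continuum we write:

$$dv_P^\mu = v_P^{\mu,\nu}dx_\nu. \tag{20.10}$$

Then we have:

$$\begin{aligned}
v_Q^\mu &= v_P^\mu + dv_P^\mu = v_P^\mu + v_P^{\mu,\nu}dx_\nu \\
&= v_P^\mu + v_P^{[\mu,\nu]}dx_\nu + v_P^{(\mu,\nu)}dx_\nu \\
&= v_P^\mu + \tfrac{1}{2}(\delta^\mu_\rho\delta^\nu_\sigma - \delta^\mu_\sigma\delta^\nu_\rho)v_P^{\rho,\sigma}dx_\nu + v_P^{(\mu,\nu)}dx_\nu \\
&= v_P^\mu + \tfrac{1}{2}\varepsilon^{\mu\nu\tau}\varepsilon_{\tau\rho\sigma}v_P^{\rho,\sigma}dx_\nu + v_P^{(\mu,\nu)}dx_\nu,
\end{aligned}$$

where $v_P^{\mu,\nu}$ is the derivative of the velocity field at the point $P$. We define the
vorticity tensor:

$$\omega_P^{\mu\nu} = v_P^{[\mu,\nu]} \tag{20.11}$$

and get:

$$v_Q^\mu = v_P^\mu + \omega_P^{\mu\nu}dx_\nu + v_P^{(\mu,\nu)}dx_\nu. \tag{20.12}$$

Next we define the symmetric (0,2) tensor:

$$e_P^{\mu\nu} = v_P^{(\mu,\nu)} \tag{20.13}$$

and obtain the final answer:

$$v_Q^\mu = v_P^\mu + \omega_P^{\mu\nu}dx_\nu + e_P^{\mu\nu}dx_\nu. \tag{20.14}$$

With the antisymmetric tensor $\omega_{\mu\nu}$ we associate a vector $\omega^\mu$, called the vorticity
vector, by means of the relation:

$$\omega^\mu = \frac{1}{2}\varepsilon^{\mu\nu\rho}\omega_{\rho\nu} \Leftrightarrow \omega^{\mu\nu} = \varepsilon^{\mu\rho\nu}\omega_\rho. \tag{20.15}$$

The index order is fixed so that $\varepsilon^{\mu\rho\nu}\omega_\rho dx_\nu = (\boldsymbol{\omega}\times d\mathbf{r})^\mu$, in agreement with (20.8).
In terms of the vorticity vector the relative velocity is written:

$$v_Q^\mu = v_P^\mu + \varepsilon^{\mu\rho\nu}\omega_\rho dx_\nu + e_P^{\mu\nu}dx_\nu. \tag{20.16}$$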
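
The split of the relative velocity into a rotation part and a strain part can be illustrated numerically. The sketch below assumes a made-up velocity gradient `G` at $P$, with `G[mu][nu]` standing for the derivative of the $\mu$ component of the velocity with respect to $x^\nu$, and takes the vorticity vector to be half the curl of the velocity field, as in (20.7). All numbers and function names are hypothetical.

```python
def decompose(G):
    """Split G[mu][nu] = d v^mu / d x^nu into antisymmetric and symmetric parts."""
    omega = [[0.5 * (G[m][n] - G[n][m]) for n in range(3)] for m in range(3)]
    strain = [[0.5 * (G[m][n] + G[n][m]) for n in range(3)] for m in range(3)]
    return omega, strain


def vorticity_vector(G):
    """omega = (1/2) curl v, written out in components."""
    return (0.5 * (G[2][1] - G[1][2]),
            0.5 * (G[0][2] - G[2][0]),
            0.5 * (G[1][0] - G[0][1]))


def matvec(M, x):
    return tuple(sum(M[m][n] * x[n] for n in range(3)) for m in range(3))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


# Hypothetical velocity gradient at P (units 1/s) and small separation dx (units m).
G = [[0.1, 0.4, -0.2],
     [0.0, -0.3, 0.5],
     [0.6, 0.2, 0.2]]
dx = (1e-3, -2e-3, 0.5e-3)

omega_t, e = decompose(G)
w = vorticity_vector(G)

relative_velocity = matvec(G, dx)      # v_Q - v_P in the linear approximation
rotation_part = matvec(omega_t, dx)    # vorticity tensor contracted with dx
strain_part = matvec(e, dx)            # strain tensor contracted with dx
rotation_from_vector = cross(w, dx)    # omega x dr

print(relative_velocity)
print(rotation_part, rotation_from_vector)
print(strain_part)
```

The rotation part computed from the antisymmetric tensor coincides with $\boldsymbol{\omega}\times d\mathbf{r}$, and the rotation and strain parts add up to the full relative velocity.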




20.1.3 The Kinematic Interpretation of Newtonian 
Relative Velocity 


Equation (20.8) (or (20.16)) implies that the velocity of the arbitrary point Q near 
the point P can be described by the sum of three terms: 


• The velocity $\mathbf{v}_P$, which corresponds to a motion in which all points near $P$
(including $P$) are displaced by $\mathbf{v}_P dt$. This type of motion we call translation.

• The velocity $\boldsymbol{\omega}\times\mathbf{r}$, which corresponds to a motion in which $P$ and the points along
the line parallel to $\boldsymbol{\omega}$ which passes through $P$ remain fixed. This type of motion
we call rotation about the axis $\boldsymbol{\omega}$ with angular speed $\omega$.

• The velocity $\frac{1}{2}\nabla(\mathbf{r}\cdot d\mathbf{v}_P)$, which represents a motion in which $P$ is the only point
which remains fixed. This type of motion we call strain.


The first two motions are defined in terms of the vectors $\mathbf{v}_P$, $\boldsymbol{\omega}$ and the last
motion in terms of the quantity $d\mathbf{v}_P\cdot\mathbf{r}$, which is defined by the tensor $e^{\mu\nu} = v^{(\mu,\nu)}$
(see (20.13)).

A motion which is either a translation, a rotation or a combination of the two we 
call a rigid motion. 

Newtonian deformable bodies which can move only with rigid motion we call
rigid bodies or solids. Under certain conditions a deformable body can
move like a rigid body, but in general it does not do so. Another special type of
Newtonian deformable bodies are the Newtonian fluids, which are defined as the
Newtonian deformable bodies which sustain strain motion only.

We note that the motion strain is possible only when the material is deformable, 
that is the distance between two points of the material changes in time. 

We picture equation (20.8) by considering the effect of each type of motion on a 
deformable body in the form of a sphere (see Fig. 20.1). 


Fig. 20.1 The three possible types of relative motion (translation, rotation, strain motion)




The geometric explanation of Fig. 20.1 is the following. 

Translation is a motion such that all the points of the sphere (including its center) 
have the same velocity and the distance between any two points remains constant. 

Rotation is a motion in which the center of the sphere and all points along the
diameter parallel to the direction of $\boldsymbol{\omega}$ (the axis of rotation) remain fixed, while the
rest of the points of the sphere change position in such a manner that their distance
from the axis of rotation and the distance between any two points of the sphere is
constant.

For these two types of motion no change in the shape or the volume of the sphere 
occurs. 

The third type of motion is one for which only the center of the sphere remains 
fixed and the shape of the sphere changes e.g. to a (rotational) ellipsoid. For this type 
of motion (in general) the distance between all the points of the continuum changes. 

From the discussion so far we may draw the following important conclusions 


1. Equation (20.8) (or better (20.16)) is covariant with respect to the Galilean 
group of transformations, therefore describes an intrinsic property of the New- 
tonian deformable bodies independent of the particular Newtonian Inertial 
Observer. 

2. In the above approach the classification of the Newtonian deformable bodies is 
done according to the type of motion they are able to sustain and not according 
to their constitution. 

3. There are three types of Newtonian deformable bodies: 


a. Rigid Bodies: They can sustain rigid motions only 

b. Newtonian fluids: They can sustain strain motion only and 

c. Newtonian deformable bodies: They can sustain all types of motion simultane- 
ously. 


20.2 Newtonian Kinematics of a Linear Deformable Body 
in Terms of the Connecting Vector 


In this section we discuss the Newtonian kinematic interpretation of (20.8) of 
a linear deformable body in terms of the connecting vector instead of relative 
velocities. 

We consider two nearby⁵ points $P$, $Q$ in a Newtonian deformable medium which
we assume suffers a small deformation so that the points $P$, $Q$ are displaced to
the points $P'$, $Q'$ respectively. Let $\mathbf{s} = PP'$, $\mathbf{s}' = QQ'$ be the displacement vectors of
the points $P$, $Q$. Our purpose is to calculate the new vector $P'Q' = \mathbf{r}'$ in terms of the


⁵ Nearness is understood as in the last section. That is, $P$ is the origin of the coordinates and the
position vector $\mathbf{r}_Q$ of the point $Q$ has components $\mathbf{r}_Q = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$ where the quantities $x, y, z$
are small in the sense that their mutual products can be neglected.




Fig. 20.2 The geometric description of a small displacement


original vector PQ = r and the displacement vectors s, s’. Because the points P, Q 
are arbitrary the connecting vector is a vector field defined in the internal region of 
the deformable body. 

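We assume that the deformation of the body takes place in time $dt$ with the
velocities of the points $P$, $Q$ being $\mathbf{v}_P$ and $\mathbf{v}_Q$ respectively. We have then:

$$\mathbf{s}' = \mathbf{v}_Q\,dt, \qquad \mathbf{s} = \mathbf{v}_P\,dt.$$

Replacing $\mathbf{v}_Q$ from (20.8) we find⁶:

$$\mathbf{s}' = \mathbf{s} + \frac{1}{2}(\nabla\times d\mathbf{v}_P\,dt)\times\mathbf{r} + \frac{1}{2}\nabla(\mathbf{r}\cdot d\mathbf{v}_P\,dt). \tag{20.17}$$

The linearity of the space gives (see Fig. 20.2):

$$\mathbf{r}' = \mathbf{r} + \mathbf{s}' - \mathbf{s} = \mathbf{r} + \frac{1}{2}(\nabla\times d\mathbf{v}_P)\times\mathbf{r}\,dt + \frac{1}{2}\nabla(\mathbf{r}\cdot d\mathbf{v}_P)\,dt. \tag{20.18}$$

The first two parts correspond to a rigid body motion. Indeed the first term is a
translation by $\mathbf{r}$. Concerning the second term we note that it is written as $\boldsymbol{\omega}\times\mathbf{r}\,dt$
where $\boldsymbol{\omega}$ is the vorticity vector, hence this term accounts for rotation. The strain
motion is described by the third term $\frac{1}{2}\nabla(\mathbf{r}\cdot d\mathbf{v}_P)\,dt$.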

Next we consider the strain motion. 

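Due to the differentiability of the velocity and other fields associated with the
deformable body, the distance $d\mathbf{v}_P\,dt$ must be the differential (i.e. the $\mathbf{s}' - \mathbf{s}$) of the
change $\mathbf{s}(\mathbf{r}_P)$ suffered by the point $P$. Therefore we have the condition⁷:

$$d\mathbf{v}_P\,dt = d\mathbf{s}(\mathbf{r}_P)$$

from which follows $\mathbf{r}\cdot d\mathbf{v}_P\,dt = \mathbf{r}\cdot d\mathbf{s}(\mathbf{r}_P)$.
To compute the strain motion explicitly we consider a coordinate frame, the
$\Sigma$ say, in which we assume that $\mathbf{r} = (x, y, z)$, $\mathbf{v} = (u, v, w)$. Then from (19.9)
we have (if we replace $d\mathbf{r}$ with $\mathbf{r}$ due to our linear approximation):

$$d\mathbf{v}_P = \mathbf{r}\cdot\left.\frac{\partial\mathbf{v}}{\partial\mathbf{r}}\right|_P$$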

⁶ Do not overlook the fact that we consider linear deformable bodies only!


⁷ This condition defines the linear deformable body at the level of relative position vectors, thus
completing the definition we gave previously in terms of the relative velocity.




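i.e. the change of the velocity in the direction of $\mathbf{r}$. The inner product (all derivatives are evaluated at $P$):

$$\begin{aligned}
\mathbf{r}\cdot d\mathbf{v}_P = {}& x^2\frac{\partial u}{\partial x} + xy\frac{\partial v}{\partial x} + xz\frac{\partial w}{\partial x} \\
&+ yx\frac{\partial u}{\partial y} + y^2\frac{\partial v}{\partial y} + yz\frac{\partial w}{\partial y} \\
&+ zx\frac{\partial u}{\partial z} + zy\frac{\partial v}{\partial z} + z^2\frac{\partial w}{\partial z}
\end{aligned}$$

from which follows:

$$\begin{aligned}
\frac{1}{2}\nabla(\mathbf{r}\cdot d\mathbf{s}(\mathbf{r}_P)) = {}& \left[x\frac{\partial u}{\partial x} + \frac{y}{2}\left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right) + \frac{z}{2}\left(\frac{\partial u}{\partial z} + \frac{\partial w}{\partial x}\right)\right]dt\,\mathbf{i} \\
&+ \left[\frac{x}{2}\left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right) + y\frac{\partial v}{\partial y} + \frac{z}{2}\left(\frac{\partial v}{\partial z} + \frac{\partial w}{\partial y}\right)\right]dt\,\mathbf{j} \\
&+ \left[\frac{x}{2}\left(\frac{\partial u}{\partial z} + \frac{\partial w}{\partial x}\right) + \frac{y}{2}\left(\frac{\partial v}{\partial z} + \frac{\partial w}{\partial y}\right) + z\frac{\partial w}{\partial z}\right]dt\,\mathbf{k} \\
= {}& [e_{\mu\nu}][x^\nu]\,dt
\end{aligned}$$

where $[x^\nu] = (x, y, z)^t$ and we have introduced the strain tensor (see also (20.13)):

$$[e_{\mu\nu}] = \begin{pmatrix}
\dfrac{\partial u}{\partial x} & \dfrac{1}{2}\left(\dfrac{\partial u}{\partial y} + \dfrac{\partial v}{\partial x}\right) & \dfrac{1}{2}\left(\dfrac{\partial u}{\partial z} + \dfrac{\partial w}{\partial x}\right) \\
\dfrac{1}{2}\left(\dfrac{\partial u}{\partial y} + \dfrac{\partial v}{\partial x}\right) & \dfrac{\partial v}{\partial y} & \dfrac{1}{2}\left(\dfrac{\partial v}{\partial z} + \dfrac{\partial w}{\partial y}\right) \\
\dfrac{1}{2}\left(\dfrac{\partial u}{\partial z} + \dfrac{\partial w}{\partial x}\right) & \dfrac{1}{2}\left(\dfrac{\partial v}{\partial z} + \dfrac{\partial w}{\partial y}\right) & \dfrac{\partial w}{\partial z}
\end{pmatrix}_P \tag{20.19}$$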


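An equivalent way to look upon the matrix $[e_{\mu\nu}]$ is the strain ellipsoid defined
by the equation (all derivatives are evaluated at $P$):

$$\frac{\partial u}{\partial x}x^2 + \frac{\partial v}{\partial y}y^2 + \frac{\partial w}{\partial z}z^2 + \left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right)xy + \left(\frac{\partial u}{\partial z} + \frac{\partial w}{\partial x}\right)xz + \left(\frac{\partial v}{\partial z} + \frac{\partial w}{\partial y}\right)yz = \text{const.} \tag{20.20}$$

We note that the strain tensor has dimensions $[T]^{-1}$.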




The strain tensor defines a positive definite quadratic form i.e. a metric, therefore
it is always possible to choose at the point $P$ the coordinate frame in such a way that
the matrix $[e_{\mu\nu}]$ becomes diagonal. These axes correspond to the principal axes of
the strain ellipsoid and are called the principal directions of the strain ellipsoid.
In this frame along the frame axes (only!) the strain motion is a pure translation (the
simplest possible!). The translations are in general different along each principal
direction, therefore $[e_{\mu\nu}] = \mathrm{diag}(e_1, e_2, e_3)$ with $e_1, e_2, e_3$ different from each other.
Finally we note the useful relation:

$$\mathrm{Tr}[e_{\mu\nu}] = \sum_\mu e_{\mu\mu} = \partial_\mu v^\mu = \nabla\cdot\mathbf{v}. \tag{20.21}$$

20.3. The Motion of a Linear Newtonian Deformable Body 
in Terms of Geometry 


In the previous section we considered the relative motion of a linear Newtonian 
deformable body from a kinematical point of view. We found that the motion 
consists of two different types of motion (a) rigid body motion and (b) strain 
motion. In the following we relate these types of motion with a metric with the 
view to “geometrize” the general motion of a linear Newtonian deformable body. 
We consider first the rigid body motion. 


20.3.1 The Geometrization of Rigid Motion 


Consider the two points $P$, $Q$ in a linear deformable body whose position vectors
at some instant⁸ $t$ are $\mathbf{r}$ and $\mathbf{r} + d\mathbf{r}$ respectively. Suppose that after the action of
some forces the body moves so that the points $P$, $Q$ become $P'$, $Q'$. The linearity
of space gives $P'Q' = PQ + QQ' - PP'$ or, following the notation of the last section,
$\mathbf{r}' = \mathbf{r} + \mathbf{s}' - \mathbf{s}$. For rigid motion we have that⁹ $\mathbf{r}\cdot d\mathbf{v}_P = 0$, which implies $d\mathbf{v}_P \perp \mathbf{r}$
(instantaneous rotation about the point $P$), and from equation (20.18) we have¹⁰:

$$\mathbf{r}' = \mathbf{r} + \mathbf{s}' - \mathbf{s} = \mathbf{r} + \boldsymbol{\omega}\times\mathbf{r}\,dt. \tag{20.22}$$


⁸ Note that we are dealing with Newtonian Kinematics, therefore $t$ is common for both points. In
Relativity this is not possible!


⁹ In the following we drop the index $P$ from $\mathbf{s}_P$ because $P$ is arbitrary.


¹⁰ We omit the term $\frac{1}{2}\nabla(\mathbf{r}\cdot d\mathbf{v})\,dt$ because we are considering a rigid motion, therefore the strain
motion vanishes.




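We compute the change in the magnitude of the displacement vector $\mathbf{r}$. We
have¹¹:

$$\mathbf{r}'^2 = \mathbf{r}^2 + (\boldsymbol{\omega}\times\mathbf{r})\cdot(\boldsymbol{\omega}\times\mathbf{r})\,dt^2 + 2\mathbf{r}\cdot(\boldsymbol{\omega}\times\mathbf{r})\,dt = r^2 + \left[\omega^2 r^2 - (\boldsymbol{\omega}\cdot\mathbf{r})^2\right]dt^2 = r^2(1 + \omega^2\sin^2\theta\,dt^2),$$

where $\theta$ is the angle between $\boldsymbol{\omega}$ and $\mathbf{r}$.
The $r^2$ is infinitesimal (recall that $\mathbf{r}$ stands for $d\mathbf{r}$ because we consider infinitesimal
motions) therefore the term $\omega^2 r^2\sin^2\theta\,dt^2$ is of fourth order and can be neglected.
This implies:

$$\mathbf{r}'^2 = \mathbf{r}^2,$$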


that is, during a rigid motion the relative (Euclidean) distance of nearby points in 
the deformable body remains invariant. We conclude then that: 


Rigid motion is generated by a Euclidian isometry, that is by the Killing vectors of the 
Euclidean metric. 
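
A quick numerical illustration of this statement: under $\mathbf{r}' = \mathbf{r} + \boldsymbol{\omega}\times\mathbf{r}\,dt$ the squared length changes only by the second order term $\omega^2 r^2\sin^2\theta\,dt^2$. The sketch below uses made-up values for $\boldsymbol{\omega}$, $\mathbf{r}$ and $dt$; a connecting vector of unit scale is an assumption made only to keep floating point rounding negligible, since the algebraic identity holds for any $\mathbf{r}$.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


omega = (0.2, -0.1, 0.4)   # hypothetical vorticity vector (rad/s)
r = (1.0, 2.0, -1.0)       # hypothetical connecting vector
dt = 1e-3                  # small time step (s)

wxr = cross(omega, r)
r_new = tuple(a + b * dt for a, b in zip(r, wxr))   # r' = r + omega x r dt

# Actual change of the squared length.
change = dot(r_new, r_new) - dot(r, r)

# Prediction omega^2 r^2 sin^2(theta) dt^2, with theta the angle between omega and r.
cos_theta = dot(omega, r) / math.sqrt(dot(omega, omega) * dot(r, r))
sin2_theta = 1.0 - cos_theta ** 2
predicted = dot(omega, omega) * dot(r, r) * sin2_theta * dt ** 2

relative_change = change / dot(r, r)
print(change, predicted, relative_change)
```

The relative change of the squared length scales as $dt^2$, so it is negligible compared with the first order rotation of the vector itself.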


From Geometry we know that the (continuous) Euclidean isometries constitute
a six dimensional Lie group, the latter consisting of the closed three dimensional
Abelian subgroup of translations $T(3)$ and the closed three dimensional subgroup
of rotations $O(3)$. This implies that we can understand (or decompose) the motion of
a rigid body as a combination of two types of different transformations: Translations
(corresponding to $T(3)$) and rotations (corresponding to $O(3)$). This is exactly what
we have found by studying relative motion in Newtonian Physics! We may say then
that during a rigid motion the space is “frozen” into the system. In other words the
existence of rigid motions in Newtonian Physics is equivalent to the existence of the
Euclidean metric!

There are two ways to look at this equivalence.


a. To consider the background space to be the Euclidean space (that is, not simply
the linear space $\mathbb{R}^3$) and understand the rigid motion as a motion of parts of space
enclosing a Newtonian deformable body. This is the view taken in Newtonian
Physics, that is, the geometry is considered to be a property of the space not of 
the motion and consequently of the structure of the deformable bodies. 

b. Our approach is different, in the sense that we consider the geometry as being 
a property of motion and not as a property of space. Let us analyze this a little 
further. 


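¹¹ The proof of this result is as follows. The term $\mathbf{r}\cdot(\boldsymbol{\omega}\times\mathbf{r}) = 0$ because $\mathbf{r}$ is normal to $(\boldsymbol{\omega}\times\mathbf{r})$.
From the identity of vector calculus

$$(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{C}\times\mathbf{D}) = (\mathbf{A}\cdot\mathbf{C})(\mathbf{B}\cdot\mathbf{D}) - (\mathbf{A}\cdot\mathbf{D})(\mathbf{B}\cdot\mathbf{C})$$

we have:

$$(\boldsymbol{\omega}\times\mathbf{r})\cdot(\boldsymbol{\omega}\times\mathbf{r}) = \omega^2 r^2 - (\boldsymbol{\omega}\cdot\mathbf{r})^2 = \omega^2 r^2(1 - \cos^2\theta) = \omega^2 r^2\sin^2\theta.$$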




We have defined the rigid body as a deformable body which is able to execute
rigid motions only. We have shown that this type of motion is defined by means of
the two vector fields (the $\mathbf{v}_P$, $\boldsymbol{\omega}$) and that these vector fields generate two groups
of motions, the translations $T(3)$ and the rotations $O(3)$. We pose the following
question:


Is there a metric (that is a non-degenerate second order symmetric tensor) whose isometries
(=Killing vectors) generate the Lie group with subgroups $T(3)$ and $O(3)$?


If the answer is “yes”, then we have defined a metric in the linear space $\mathbb{R}^3$ via its
group of isometries, and in a sense we have geometrized rigid motion. The answer 
as to the existence of such a metric will be given by observation. That is, we have 
first to observe in nature rigid motions and thus prove that such a metric exists. In 
this approach geometry is defined by physical observation and it is not an inherent 
property of space which modulates motion. In fact we do not need to give space 
special geometric properties, besides its linearity. In a way we are in accord with 
Newton, who in his celebrated book Principia defines space as: 


Absolute space, in its own nature, without regard to anything external, remains always 
similar and immovable. 


However, there is still a key point we have to address. This is the 
method of observation. As we have seen in Chap. 2 each theory of Physics
postulates certain means and procedures for observing motion. For example in 
Newtonian Physics it is postulated that one observes motion using an absolute stick 
of unit length and an absolute (i.e. universal to all Newtonian observers) clock. In 
Special Relativity it is postulated that motion is measured by means of light signals 
and synchronized clocks. Therefore the result of observation depends also on the 
method of observation! In that sense, the approach that the geometry defined by the 
relative motion in a theory of Physics and verified in the real “world” is applicable 
to that theory only, because the procedures of observation of motion are specific to 
that theory only. 

The beauty of this approach is that everything depends upon our definition of 
observation.¹² There is nothing beyond and above the human power or beyond our


¹² A similar point of view seems to be taken in Cybernetics, although not justified so clearly. They
say (see http://pespmc1.vub.ac.be/):

“Among the most elementary actions known to us are small displacements “in space”. We have put 
the quotes, because people have accustomed to imagine that some entity, called “space” exists as a 
primary reality, which creates the possibility of moving from one point of this space to another. Our 
analysis turns this notion topsy-turvy. Only actions constitute observable reality; space is nothing 
but a product of our imagination which we construct from small displacements, or shifts, of even 
smaller objects called points. If x is such a shift, then xx — the action x repeated twice — is a double 
shift, which we would call in our conventional wisdom a shift at the double distance in the same 
direction. On the other hand, we may want to represent a shift x as the result of another shift x’ 
repeated twice: x = x’x’. It so happens that we can make three different kinds of shifts, call them 
x, y, z, none of which can be reduced to a combination of the other two. At the same time any shift
w can be reduced to a properly chosen combination of shifts x, y, z. So we say that our space has 
three dimensions.” 




experience. Geometry is a human construction and it is dictated by our approach 
of motion and structure in general. There is nothing of a metaphysical nature in 
Physics! 

The observational fact that there do exist rigid motions in nature, when we 
observe motion with the specified Newtonian method, takes us to the conclusion 
that in Newtonian Physics rigid motion is equivalent to — or it can be understood in 
terms of — the celebrated Euclidian metric i.e. Euclidian Geometry. 


20.3.2. The Geometrization of Strain Motion 


Having related the rigid motion to the Euclidian geometry we continue with the 
other type of motion of a linear Newtonian deformable body, that is, the strain 
motion. As we have already shown the strain motion is described by the strain 
tensor (see (20.13)). The strain tensor, being a symmetric second rank tensor, can
be considered as a metric of signature $-3$ (i.e. a positive definite metric in $\mathbb{R}^3$) which
is not flat, i.e. it is a Riemannian metric of non-vanishing curvature. Therefore,
contrary to what is generally believed, the curvature (Riemannian Geometry) exists
in Newtonian Physics; however it is hidden in another place (the strain motion) and
not in the rigid body motion we usually work with!

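It is possible to write the change of the squared relative deformation for small
strains as follows:

$$\sum_\mu\left(dx^\mu + \sum_\nu\frac{\partial s^\mu}{\partial x^\nu}dx^\nu\right)^2 = \sum_{\mu,\nu}dx^\mu\left[\delta_{\mu\nu} + 2e_{\mu\nu}\right]dx^\nu \tag{20.23}$$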

and consider that the strain acts as a supplementary metric to the Euclidean metric. 

In this approach the strain can be considered as the deformable body’s internal 
metric that is, the geometry which is due only to the deformations of the body 
when the body is “frozen”, in the sense that it does not suffer global translation
and global rotation (global means all the points share the same property). This is 
the metric which must be used in the study of the so called rheological properties 
of a deformable body, that is the properties due only to the “static” deformations 
(=change of shape) of the body. 

Using that $s^\mu = u^\mu dt$ we obtain the rate of strain tensor

$$e_{\mu\nu} = \frac{1}{2}\left[\frac{\partial s^\mu}{\partial x^\nu} + \frac{\partial s^\nu}{\partial x^\mu}\right]. \tag{20.24}$$

Note that the ‘rate of strain tensor’ does not mean change of the strain tensor in
time but instead change of the connecting vector $s^\mu$.




20.4 The Geometry of a General Newtonian 
Deformable Body 


Having the results of the previous section as a guide we consider the case of a 
general i.e. not necessarily linear Newtonian deformable body. 

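Consider within a general Newtonian continuum the points $P$, $Q$ with position
vectors $\mathbf{x}$, $\mathbf{x} + d\mathbf{r}$ where $d\mathbf{r} = PQ$ and assume that after the action of some forces
the continuum changes its shape (is deformed) so that the points $P$, $Q$ go to points
$P'$, $Q'$ with position vectors $\mathbf{x} + \mathbf{s}(\mathbf{x})$, $\mathbf{x} + d\mathbf{r} + \mathbf{s}'(\mathbf{x} + d\mathbf{r})$ respectively. From the
linearity of the space (see also (20.18)) it follows that the distance $\mathbf{r}' = P'Q'$:

$$\mathbf{r}' = d\mathbf{r} + \mathbf{s}'(\mathbf{x} + d\mathbf{r}) - \mathbf{s}(\mathbf{x}). \tag{20.25}$$

The deformation is assumed to be continuous which means that:

$$\mathbf{s}(\mathbf{x} + d\mathbf{r}) = \mathbf{s}(\mathbf{x}) + d\mathbf{s}(\mathbf{r})$$

where $\mathbf{s}$ is the connecting vector field. To write this relation quantitatively we
consider a frame of reference,¹³ $\Sigma$ say, in which the vectors $\mathbf{r} = \mathbf{x} = (x^1, x^2, x^3)$,
$d\mathbf{r} = (dx^1, dx^2, dx^3)$ and $\mathbf{s}(\mathbf{r}) = \mathbf{u}(\mathbf{r})dt$ where $\mathbf{u}(\mathbf{r})$ is the velocity at the position
$\mathbf{r}$. Then

$$ds^\mu(\mathbf{r}) = \sum_\nu\frac{\partial s^\mu}{\partial x^\nu}dx^\nu \tag{20.26}$$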
v 


and the squared Euclidean distance $dr'^2$ between the new points $P'$, $Q'$ is by virtue
of (20.25) and (20.26):

$$dr'^2 = \sum_\mu\left(dx^\mu + \sum_\nu\frac{\partial s^\mu}{\partial x^\nu}dx^\nu\right)^2 = \sum_\mu(dx^\mu)^2 + 2\sum_{\mu,\nu}\frac{\partial s^\mu}{\partial x^\nu}dx^\mu dx^\nu + \sum_{\mu,\nu,\rho}\frac{\partial s^\mu}{\partial x^\nu}\frac{\partial s^\mu}{\partial x^\rho}dx^\nu dx^\rho.$$

The first term $\sum_\mu(dx^\mu)^2 = dr^2$ is the squared distance between the original points
$P$, $Q$ and corresponds to the rigid motion part of the deformation. The remaining
two terms give the change in squared distance between the points $P$, $Q$ which is
not covered by the rigid motion. This term is quadratic in $dx^\mu$, therefore it can be


¹³ The discussion is general and applies to linear spaces of any finite dimension.




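manipulated to define a square matrix and subsequently a second order symmetric
tensor. We note that we can write:

$$2\sum_{\mu,\nu}\frac{\partial s^\mu}{\partial x^\nu}dx^\mu dx^\nu + \sum_{\mu,\nu,\rho}\frac{\partial s^\mu}{\partial x^\nu}\frac{\partial s^\mu}{\partial x^\rho}dx^\nu dx^\rho = \sum_{\mu,\nu}dx^\mu dx^\nu\left(\frac{\partial s^\mu}{\partial x^\nu} + \frac{\partial s^\nu}{\partial x^\mu} + \sum_\rho\frac{\partial s^\rho}{\partial x^\mu}\frac{\partial s^\rho}{\partial x^\nu}\right).$$

We define the extended strain tensor $e_{\mu\nu}$ in terms of the connecting vector with
the formula:

$$e_{\mu\nu} = \frac{1}{2}\left(\frac{\partial s^\mu}{\partial x^\nu} + \frac{\partial s^\nu}{\partial x^\mu} + \sum_\rho\frac{\partial s^\rho}{\partial x^\mu}\frac{\partial s^\rho}{\partial x^\nu}\right). \tag{20.27}$$

Then the deformation is:

$$dr'^2 = dr^2 + 2e_{\mu\nu}dx^\mu dx^\nu,$$

where (summation convention understood)

$$D(dr) = 2e_{\mu\nu}dx^\mu dx^\nu \tag{20.28}$$

is the non-linear change in the squared relative distance of two particles executing
strain motion only.

We say that the strains are small strains if the derivatives $\partial s^\mu/\partial x^\nu$ are so small that
their products can be neglected. In this case the extended strain tensor reduces to the
linear strain tensor or strain tensor for small displacements given by:

$$e_{\mu\nu} = \frac{1}{2}\left(\frac{\partial s^\mu}{\partial x^\nu} + \frac{\partial s^\nu}{\partial x^\mu}\right). \tag{20.29}$$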

The Newtonian deformable bodies, which move under small strains are precisely 
the linear Newtonian deformable bodies we considered earlier. 
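
The difference between the extended strain tensor (20.27) and the linear strain tensor (20.29) can be made concrete with a small computation. The sketch below assumes a hypothetical homogeneous deformation $s^\mu = A^\mu{}_\nu x^\nu$ with a small, made-up matrix `A`, so that $\partial s^\mu/\partial x^\nu$ is simply `A[mu][nu]`. It compares the exact change of the squared distance with the predictions of the two tensors.

```python
def extended_strain(A):
    """E[m][n] = 1/2 (A[m][n] + A[n][m] + sum_r A[r][m] A[r][n]), cf. (20.27)."""
    return [[0.5 * (A[m][n] + A[n][m] + sum(A[r][m] * A[r][n] for r in range(3)))
             for n in range(3)] for m in range(3)]


def linear_strain(A):
    """e[m][n] = 1/2 (A[m][n] + A[n][m]), cf. (20.29)."""
    return [[0.5 * (A[m][n] + A[n][m]) for n in range(3)] for m in range(3)]


def quad(E, dx):
    """The quadratic form E_{mu nu} dx^mu dx^nu."""
    return sum(E[m][n] * dx[m] * dx[n] for m in range(3) for n in range(3))


# Hypothetical displacement gradient ds^mu/dx^nu, of order 1e-3.
scale = 1e-3
A = [[scale * a for a in row] for row in ([1, 2, 0], [0, -1, 3], [2, 1, 1])]
dx = (1.0, -2.0, 0.5)

# Relative displacement ds of the two points, cf. (20.26), and the new separation.
ds = tuple(sum(A[m][n] * dx[n] for n in range(3)) for m in range(3))
dx_new = tuple(a + b for a, b in zip(dx, ds))

exact_change = sum(c * c for c in dx_new) - sum(c * c for c in dx)
extended_prediction = 2 * quad(extended_strain(A), dx)
linear_prediction = 2 * quad(linear_strain(A), dx)

print(exact_change, extended_prediction, linear_prediction)
print(exact_change - linear_prediction)   # neglected term, quadratic in the gradients
```

The extended tensor reproduces the exact change, while the linear tensor misses a term of second order in the displacement gradients, which is what "small strains" permits us to drop.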

Because in the following we shall be concerned with strain motions generating 
small displacements we shall restrict our discussion to linear Newtonian deformable 
bodies. To simplify the wording in the following we shall drop the word linear and 
say simply deformable bodies. Furthermore by strain we shall mean strain tensor 
for small displacements. 

Just in passing we may remark that the non-linear strains should be related to 
non-linear theories of motion as, for example, General Relativity. 


20.5 The Stress Tensor of a Linear Deformable Continuum 


As we have shown the general motion of a (linear) deformable continuum consists 
of three parts: 


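a. A pure translation $\mathbf{r}$, which is a vector.
b. A pure rotation $\frac{1}{2}(\nabla\times d\mathbf{v}_P)\times\mathbf{r}\,dt$, which is a pseudo-vector or 1-form.


c. A pure strain $\frac{1}{2}\nabla(\mathbf{r}\cdot d\mathbf{v}_P)\,dt$, which is a symmetric 2nd rank tensor.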


According to Newton’s Second Law the first two types of motion are produced 
respectively from a vector (the force F) and a pseudo-vector or 1-form (the torque 
M). We expect that the strain motion will be produced by a form of Newton’s law, 
which will involve a symmetric second rank tensor. This new tensor we call the 
stress tensor. How the stress tensor is defined, what it measures and how it is related
to the strain tensor it produces are questions which we address in this and the next
sections.

Simple experimental observations have shown that the deformation generating 
strain motion depends on force per unit area and not on the force alone. For example, 
a given load will extend a thin wire more than a thick one, because in the first case 
the area of the cross section is less and therefore the force per unit area greater. The 
deformation resulting from a strain motion is due to the adjustment of the internal
forces between individual particles, and the extent of this adjustment depends on
the additional forces experienced by each of the particles affected — that is, on the 
additional force per particle. But each particle has a definite effective area, hence 
we have to consider the area over which it acts as well as the force itself, which 
implies that the additional force per particle is proportional to the external force per 
unit area. Therefore we define: 

Stress is the force per unit area regarded as being transmitted throughout the 
material of the deformable body at global rest (that is, after global translation and 
global rotation i.e. rigid motion have been removed). 

In SI units the unit of stress is N m$^{-2}$ = 1 pascal (Pa), i.e. the unit of pressure, and in
the English system pounds per square inch (psi). Because the pascal is a rather small
stress, stresses are usually expressed in megapascals (MPa) or, in English units, kilopounds
per square inch (ksi).

Since the stress is a second rank symmetric tensor in a coordinate frame it can be 
represented uniquely by a square symmetric matrix whose elements are the stress 
components in that frame. We shall assume orthonormal (i.e. Cartesian) frames only 
(this is always possible in Newtonian Physics) and shall follow the notation: 


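$$[\sigma_{ij}] = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ \sigma_{21} & \sigma_{22} & \sigma_{23} \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{pmatrix}. \tag{20.30}$$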


The stress tensor may be considered as an additional metric entering in the study 
of the motion of a linear deformable Newtonian continuum. This metric geometrizes 




the environment causing the strain motion of that continuum. It is assumed to be
positive definite, that is, at every point of the continuum when expressed in the
principal axes it has the form $\mathrm{diag}(b_1^2, b_2^2, b_3^2)$.

A transformation $A$ of the coordinates is an isometry of the stress metric if it
satisfies the relation $[\sigma_{ij}] = A^t[\sigma_{ij}]A$.


20.5.1 Body and Surface Forces 


In order to describe the stress tensor in Newtonian dynamical terms we have to
describe it in terms of forces. To do that we introduce two types of forces which are
easily distinguished from one another and are defined as follows. Forces which act on
all volume elements of the deformable body we call body forces. One such
type of force is gravity. We designate body forces by the vector symbol $b_i$ (force
per unit mass) or by the symbol $p_i$ (force per unit volume). If $\rho$ is the density of the
deformable body we have the relation:

$$\rho b_i = p_i. \tag{20.31}$$


The other type of force concerns the forces which act upon and are distributed in
some fashion over a surface element, regardless of whether that surface is part of the
bounding surface or an arbitrary element of surface within the deformable body.
These forces we call surface forces and denote with the symbol $f_i$ (force per unit
area). Examples of this type of force are the contact forces, the forces which result from
the transmission of forces across a surface internal to the deformable body, etc.

Suppose that at a point P within a Newtonian deformable body the stress tensor
is $\sigma_{ij}$ and that we place a surface dS at P whose unit normal is $n^j$. Then the stress
vector at P is the vector¹⁴


\[ t_i = \sigma_{ij} n^j \tag{20.32} \]


which is the surface force on the surface dS when placed at P. Different surface 
elements placed at P will lead to different stress vectors at P. 

At the point P of the deformable body we consider the orthonormal frame
$\{x_1, x_2, x_3\}$, draw a cube and represent the components of the stress tensor as shown
in Fig. 20.3. In drawing the components we follow the convention that a positive 
stress component is represented with an arrow in the positive direction of one of 
the coordinate axes while acting on a plane whose outward normal also points in a 
positive coordinate direction. All the stress components displayed in Fig. 20.3 are 
positive. 


¹⁴ The formula $t_i = \sigma_{ij} n^j$ is known as the Cauchy stress formula.


Fig. 20.3 The Newtonian stress tensor components $\sigma_{ij}$, shown on the faces of a cube with axes $x_1$, $x_2$, $x_3$


The three stress components shown by arrows acting normal to the respective
coordinate planes and labeled $\sigma_{ii}$, $i = 1, 2, 3$ (no summation), are called normal stresses. The six
arrows lying in the coordinate planes and pointing in the direction of the coordinate
axes, namely $\sigma_{ij}$, $i \neq j = 1, 2, 3$, are called shear stresses. Note that the first
subscript (the i) identifies the coordinate plane (that is, the plane normal to the i-axis)
on which the shear stress acts, and the second subscript (the j) identifies the
coordinate direction in which it acts.


20.5.2 The Perfect Deformable Body 


Consider a linear Newtonian deformable body at global rest which under the 
influence of a stress executes strain motion. Assume that the deformable body is 
homogeneous, that is, it has the same properties at all of its points. This means that 
the deformable body is invariant under the translations generated by the group T (3). 

Due to the strain motion the shape of the deformable body changes and as a result 
various forces develop in its interior. To study these forces we immerse a small flat 
surface element of area dS at various points and study the net force exerted on the 
surface. There are two possibilities: 


a. The net force acting on the surface dS is normal to the plane of dS. 
b. The net force on the surface element dS is inclined with respect to the surface
normal (that is, it is not collinear with the normal vector of the surface dS).




In the first case if the surface element dS moves parallel to the plane of the 
surface, it does not consume work i.e. it moves “freely” into the deformable body. 
A deformable body, in which the force of the stress on an elementary surface dS 
placed at all its points is normal to the surface for all directions of dS, we call 
a perfect Newtonian deformable body. Obviously the physical properties of a 
perfect deformable body are isotropic, that is, invariant under the action of the group 
of rotations O(3). 

In order to compute the stress required to produce a strain motion in a perfect
linear Newtonian deformable body we consider a point P within the deformable
body and define at P three orthogonal axes {x, y, z}. If we place an elementary
surface dS normal to the x-axis the corresponding force will be along the x-axis,
that is, we have $F_x = (p_x dS, 0, 0)$ where $p_x$ is the pressure on the surface dS.
Similarly the forces on the elementary surface dS placed normal to the axes y, z are
$F_y = (0, p_y dS, 0)$, $F_z = (0, 0, p_z dS)$ respectively. Because a perfect deformable
body is isotropic (that is, invariant under the action of O(3)) all directions must be
equivalent, therefore we must have¹⁵ $p_x = p_y = p_z = p$. The common pressure p
at P (the value of p in general is different at different points within the deformable
body) we call the isotropic pressure. We conclude that the stress tensor producing a
strain motion of a perfect Newtonian deformable body is described by a scalar field
(the isotropic pressure). Furthermore we note that the three forces $F_x, F_y, F_z$ can
be described altogether with the diagonal matrix diag(p, p, p) so that the explicit
form of the stress is the symmetric second rank tensor $p\,\delta_{\mu\nu}$.
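
The statement that an isotropic stress $p\,\delta_{\mu\nu}$ gives a force normal to every surface element can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the value of p are hypothetical, and the Cauchy formula $t_i = \sigma_{ij} n^j$ is applied to a few arbitrarily chosen unit normals.

```python
import math

def stress_vector(sigma, n):
    """Cauchy formula t_i = sigma_ij n_j for a 3x3 stress matrix and a unit normal n."""
    return [sum(sigma[i][j] * n[j] for j in range(3)) for i in range(3)]

def isotropic_stress(p):
    """Stress matrix p * delta_ij of a perfect deformable body."""
    return [[p if i == j else 0.0 for j in range(3)] for i in range(3)]

p = 2.5  # hypothetical isotropic pressure, arbitrary units
sigma = isotropic_stress(p)

s = 1.0 / math.sqrt(3.0)
normals = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.6, 0.0, 0.8], [s, s, s]]

# For every orientation of dS the stress vector is p times the normal,
# i.e. it has no tangential component.
tractions = [stress_vector(sigma, n) for n in normals]
for n, t in zip(normals, tractions):
    print(n, "->", t)
```

For a stress matrix with non-vanishing off-diagonal components the same function returns a vector that is in general not parallel to n.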

In terms of Geometry this means that the metric of stress tensor of a perfect linear 
Newtonian deformable body is conformally related to the (flat) Euclidian metric 
the conformal factor being the isotropic pressure p. This observation can be very 
important when one studies geometrically the dynamics of a perfect deformable 
body. 

The following points are of special interest in the case of perfect Newtonian 
deformable bodies. 


a. A perfect Newtonian deformable body is characterized by a mass density $\rho$ (say)
whereas its interaction with the environment is characterized by the isotropic
pressure. The fact that the stress tensor is defined independently of the mass
density opens the possibility to consider perfect deformable bodies whose mass
density vanishes. Of course in Newtonian Physics such material continua do not
exist, however they do exist in relativistic Physics. One such deformable body
consists of photons (it is known as a photon gas); its dynamics is studied when, e.g.,
one computes the pressure it exerts on a given surface.

b. Another extreme situation is to consider a perfect Newtonian deformable body
for which both the mass density and the pressure vanish. Such a material is
“absolute”, in the sense that no other material can interact with it. This
deformable body may be considered as the empty space. In this approach the


¹⁵ This is the experimental result of Pascal with perfect fluids.




empty space is considered to be a definite physical system and not a passive
substratum where all types of motion take place. The same cannot be said of
time, whose nature is different and which must be considered as a device that we use
to “count” the sequence of events (i.e. changes of state or motion) of physical
systems in the physical environment.


20.5.3 The Imperfect Deformable Body 


Real Newtonian deformable bodies are not perfect. That is, when an elementary 
surface is placed inside a Newtonian deformable body tangential components of 
stress (i.e. forces) always exist. The immediate consequence of the existence of 
tangential forces, is that the property of the equality of the pressure in all directions 
at every point no longer holds. 

To study the new situation we consider again at an arbitrary point P of the
deformable body three orthogonal axes {x, y, z} and place the elementary surface
dS normal to the x-axis. In this case the force on the surface will not be normal
to the surface and it will have in general components along all axes {x, y, z}.
We write in an obvious notation $F_x = (p_{xx} dS, p_{xy} dS, p_{xz} dS)$. Similarly the
forces on the elementary surface dS placed normal to the axes y, z shall be
$F_y = (p_{yx} dS, p_{yy} dS, p_{yz} dS)$ and $F_z = (p_{zx} dS, p_{zy} dS, p_{zz} dS)$ respectively. The
components $p_{xx}, p_{yy}, p_{zz}$ are normal to the surfaces on which they act, while
the remaining six denote tangential components. For example $p_{xy} dS$ is the force in the
direction y on the area dS perpendicular to the axis x. We shall consider the symbols
$p_{xx}, p_{yy}, p_{zz}$ to be positive when they represent tensions (that is, the force is along
the outward normal to the surface), so that a pressure is to be regarded as a negative
stress. Because the surface dS does not rotate the total moments on it must vanish,
therefore we must have¹⁶ $p_{zy} = p_{yz}$ and similarly for the other two pairs of axes.

We summarize the above considerations as follows (see Fig. 20.3 for explanation 
of the indices): 


1. In an arbitrary orthogonal coordinate frame {x,y,z}, the three forces acting 
on the elementary surface dS placed at any point P in the interior of a linear 
homogeneous Newtonian deformable body undergoing a strain motion, define 
the symmetric matrix: 


\[
\sigma_{\mu\nu} = \begin{pmatrix} p_{xx} & p_{xy} & p_{xz} \\ p_{yx} & p_{yy} & p_{yz} \\ p_{zx} & p_{zy} & p_{zz} \end{pmatrix}
\]


¹⁶ As we have said the strain is the “motion” of a Newtonian deformable body which remains after
the global translation and the global rotation have been removed. Therefore under strain motion 
only the shape of the body changes. Because we have no rotation the stress tensor which produces 
the strain must be symmetric (otherwise it would produce couples, therefore rotation about the 
center of mass.) 




This matrix defines (in the frame used) the components of a two index symmetric 
tensor (possibly degenerate), which we call the stress tensor of the Newtonian 
deformable body. It is this tensor which causes the strain motion of the imperfect 
linear Newtonian deformable body. 

2. For a perfect linear Newtonian deformable body we have:


\[ p_{xx} = p_{yy} = p_{zz} = p, \qquad p_{xy} = p_{xz} = \dots = p_{zy} = 0 \]


and the stress tensor is $\sigma_{\mu\nu} = p\,\delta_{\mu\nu}$, that is, the stress metric is conformally related
to the space Euclidian metric, the conformal factor being the isotropic pressure p.
3. As was the case with the strain tensor, the stress tensor defines the stress ellipsoid,
has stress principal axes etc. Furthermore the stress tensor can be considered
as a metric on the real space $R^3$, which geometrizes the interaction of the
environment with the linear Newtonian deformable body. If this interaction is due
to a force field (as in the case of the gravitational field) this metric geometrizes
this force field.¹⁷

4. For a general motion of a linear Newtonian deformable body there are associated
the following three positive definite metrics:


a. The Euclidean metric which geometrizes the rigid motions 

b. The strain metric which geometrizes the strain motions 

c. The stress metric which geometrizes the interaction of the environment with 
the linear Newtonian deformable body. 
The first metric is flat and independent of the deformable body defining (or 
depending on) the geometry of the space where motion occurs. The remaining 
two metrics are related respectively to the material nature of the Newtonian 
deformable body and its interaction with the environment. In general they are 
not flat and can be degenerate. If for a perfect Newtonian deformable body 
both the strain metric and the stress metric are not degenerate then they are 
conformally related to the Euclidian metric, that is they satisfy the relation 


\[ e_{\mu\nu} = B\, t_{\mu\nu} = B\, p\, \delta_{\mu\nu} \tag{20.33} \]


where B is a characteristic function of the material. Note that time does not
enter in this relation, therefore the relation between strain and stress is not a
dynamical one (although there does exist a cause-effect relation).


20.6 Classification of the Stress Tensor of Linear Newtonian Deformable Bodies


For the purpose of classifying the interaction of Newtonian deformable bodies with 
the environment there have been recognized three types of elementary stress and 
strain. 


¹⁷ This observation brings us close to the case of General Relativity.




Fig. 20.4 Longitudinal stress 


20.6.1 Longitudinal Stress 


The longitudinal stress (see Fig. 20.4) concerns one dimensional strain and stress
and results from the application of an external force F along the direction of the
one dimensional deformable body of cross sectional area A. It is given by the ratio
F/A. If the application of the stress results in a strain $\delta x$ (elongation) of the body
and the body has length x before the application of the stress, then the measure of the
longitudinal strain is given by $\delta x/x$. The Young modulus E (or Y) is the pressure
defined by the ratio:


\[ E = \frac{F/A}{\delta x/x}. \tag{20.34} \]


In practice there are no one dimensional bodies, therefore the longitudinal stress is
an idealized situation. In reality we have one dimensional bodies with thickness, e.g.
in the form of wires. In this case it is (experimentally) observed that the continuum
suffers two changes:


— longitudinal extension and 
— lateral contraction. 


This strain motion requires the definition of an additional parameter, which will
account for the lateral contraction of the deformable body. This new parameter is
the Poisson ratio, which is defined as follows:


\[ \sigma = \frac{\text{lateral contraction per unit length}}{\text{longitudinal elongation per unit length}} = \frac{\delta y/y}{\delta x/x}. \tag{20.35} \]


The Poisson ratio is dimensionless.
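
As an illustration of the two definitions above, the following sketch evaluates the Young modulus and the Poisson ratio for a stretched wire. All numerical values are hypothetical and chosen only to give round numbers; the function names are not from the text.

```python
def young_modulus(force, area, length, elongation):
    """E = (F/A) / (dx/x): longitudinal stress over longitudinal strain."""
    return (force / area) / (elongation / length)

def poisson_ratio(lateral_contraction, width, elongation, length):
    """sigma = (dy/y) / (dx/x): lateral over longitudinal strain."""
    return (lateral_contraction / width) / (elongation / length)

# Hypothetical wire: 100 N applied on a cross section of 1 mm^2,
# original length 2 m, elongation 1 mm, diameter 1 mm shrinking by 0.15 micrometres.
E = young_modulus(force=100.0, area=1.0e-6, length=2.0, elongation=1.0e-3)
sigma = poisson_ratio(lateral_contraction=1.5e-7, width=1.0e-3,
                      elongation=1.0e-3, length=2.0)

print(f"E     = {E:.3e} Pa")   # a pressure
print(f"sigma = {sigma:.3f}")  # dimensionless
```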


20.6.2 Shear Stress 


Shear stress (see Fig. 20.5) concerns two dimensional strain and stress.

Shear forces are tangential forces applied to the planes of the areas concerned, and
produce a shear stress acting through the linear Newtonian deformable body. If
F is the force causing the strain motion and the area of the surface of application
of the force is A, then the shear stress equals F/A. The shear strain produced is a
lateral displacement per unit length of the surface of the body in the direction of the




Fig. 20.5 Shear stress 


applied force and it is measured perpendicular to the direction of the force. From
Fig. 20.5 we have that the shear strain equals the ratio of the lateral displacement to the
length measured perpendicular to the force. An equivalent measure of the shear
strain is the angle of shear $\theta$, whose tangent equals this ratio. The ratio (pressure):


\[ n = \frac{\text{shear stress}}{\text{shear strain}} = \frac{F/A}{\theta} \tag{20.36} \]


is called the modulus of rigidity. If n is a constant we say that the continuum is a
perfectly elastic material.


20.6.3 Bulk or Volume Stress 


Bulk or volume stress concerns three dimensional strain and stress (Fig. 20.6). 

When uniform pressure is exerted equally from all sides, e.g. as in a solid
immersed in a static perfect fluid, then we have the development of a stress acting on
the deformable body, which we call bulk or volume stress. If the pressure exerted
in this case is p N/m² then the volume stress transmitted throughout the material
of the body is p N/m². As a result of the volume stress we have a contraction of
the volume of the body from V to $V - \delta V$. The quotient $\delta V/V$ we call the bulk or
volume strain. The quotient (pressure):


\[ k = \frac{\text{volume stress}}{\text{volume strain}} = \frac{p}{\delta V/V} \tag{20.37} \]


we call the bulk modulus of the material of the deformable body.
The reciprocal of the bulk modulus k we call the compressibility of the material.

In general the strain motion of a linear Newtonian deformable body in the real world
is different from the ideal types encountered above. However the strain motions
we are interested in in practical situations can (in general) be understood and described
by combinations of these three simple types of stress. Furthermore it is possible to
devise experiments in which only one of the above types of strain predominates
and thus one can measure the values of the quantities $E, \sigma, n, k$.

All stresses have dimension Force/Area = Pressure, that is $ML^{-1}T^{-2}$, and all are
measured (in SI units) in N m$^{-2}$ (pascal). Similarly all strains are dimensionless
quantities.


Fig. 20.6 Bulk stress


In general all four coefficients $\sigma$, E, n and k are required in order to describe the
strain motion of a Newtonian deformable body under the influence of a given stress.
However for special cases there are relations amongst $\sigma$ and the stress coefficients
E, n and k. For example, for a linear isotropic Newtonian elastic body one has:


\[ n = \frac{E}{2(1+\sigma)}, \qquad k = \frac{E}{3(1-2\sigma)}. \tag{20.38} \]


Therefore for this type of Newtonian deformable bodies only two of the four
coefficients suffice for the description of their strain motion. From the above
relations we also conclude that for these Newtonian bodies the Poisson ratio $\sigma$:


— Cannot exceed $+\frac{1}{2}$ because k would be negative and
— Cannot be less than $-1$ because then n would be negative.


Therefore $-1 < \sigma < \frac{1}{2}$.
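
The bounds on the Poisson ratio can be illustrated numerically. The sketch below assumes the standard relations for a linear isotropic elastic body, $n = E/[2(1+\sigma)]$ and $k = E/[3(1-2\sigma)]$; the function name and the sample values of E and $\sigma$ are hypothetical.

```python
def shear_and_bulk_moduli(E, sigma):
    """Return (n, k) for a linear isotropic elastic body.

    Assumes n = E / (2 (1 + sigma)) and k = E / (3 (1 - 2 sigma)).
    """
    n = E / (2.0 * (1.0 + sigma))
    k = E / (3.0 * (1.0 - 2.0 * sigma))
    return n, k

E = 2.0e11  # hypothetical Young modulus in Pa

# A value inside the allowed interval gives positive moduli.
n_ok, k_ok = shear_and_bulk_moduli(E, 0.3)

# sigma > 1/2 makes k negative; sigma < -1 makes n negative.
_, k_bad = shear_and_bulk_moduli(E, 0.6)
n_bad, _ = shear_and_bulk_moduli(E, -1.5)

print(n_ok, k_ok, k_bad, n_bad)
```

Only two of the four coefficients are independent here, which is why the function needs just E and $\sigma$ as input.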


20.7 Worked Examples on the Newtonian Stress Tensor 


We recall some basic results from the theory of real symmetric matrices.
The eigenvectors or principal directions $n_i$ of a square matrix $T_{ij}$ are defined
by the requirement:


\[ T_{ij} n_j = \lambda n_i \]

where the scalars $\lambda$ are called the principal values or the eigenvalues of $T_{ij}$ for the
corresponding eigenvector. The eigenvalues are determined from the characteristic
equation:

\[ \det(T_{ij} - \lambda \delta_{ij}) = 0 \]

which upon expansion gives:


\[ \lambda^3 - I_T \lambda^2 + II_T \lambda - III_T = 0. \]




The coefficients $I_T$, $II_T$, $III_T$ are called the first, the second and the third
invariants respectively of the tensor $T_{ij}$. These quantities are defined in terms of
the invariant (trace) ${\rm Tr}\,T = T_{ii}$ as follows:


\[ I_T = {\rm Tr}\,T = T_{ii} \]

\[ II_T = \frac{1}{2}\left(T_{ii} T_{jj} - T_{ij} T_{ji}\right) = \frac{1}{2}\left[({\rm Tr}\,T)^2 - {\rm Tr}(T^2)\right] \]

\[ III_T = \det T. \]
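
The three invariants can be computed directly from their definitions. The sketch below is an illustration only: the helper names are hypothetical, and the test matrix is the stress matrix of Example 20.7.4 of this section, whose principal values 25, 50, 75 must be roots of the characteristic polynomial.

```python
def trace(T):
    return sum(T[i][i] for i in range(3))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(T):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (T[0][0] * (T[1][1] * T[2][2] - T[1][2] * T[2][1])
            - T[0][1] * (T[1][0] * T[2][2] - T[1][2] * T[2][0])
            + T[0][2] * (T[1][0] * T[2][1] - T[1][1] * T[2][0]))

def invariants(T):
    """First, second and third invariants of a 3x3 tensor."""
    first = trace(T)
    second = (first * first - trace(matmul(T, T))) / 2
    third = det3(T)
    return first, second, third

T = [[57, 0, 24],
     [0, 50, 0],
     [24, 0, 43]]
I_T, II_T, III_T = invariants(T)

def characteristic_polynomial(lam):
    return lam**3 - I_T * lam**2 + II_T * lam - III_T

print(I_T, II_T, III_T)
print([characteristic_polynomial(lam) for lam in (25, 50, 75)])
```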


Results on eigenvalues: 


1. For a symmetric tensor with real components the roots of the characteristic 
equation are all real and the eigenvectors are real. In a Euclidian space the 
measure of the eigenvectors is positive, therefore without loss of generality we 
can take them to be unit. 

2. The eigenvectors of two distinct eigenvalues are mutually perpendicular. 

3. If all eigenvalues are distinct the principal directions are unique and mutually 
perpendicular, therefore they define an orthonormal Cartesian frame. If two 
eigenvalues are equal, then there is only one direction associated with the third 
eigenvalue, which is unique. The remaining two can be any two directions in 
the plane normal to that direction, which are mutually perpendicular. If all 
eigenvalues are equal then every set of right-handed orthogonal axes qualifies 
as principal axes and every direction is a principal direction. 


For every symmetric tensor with real components there is an associated symmetric
matrix with real entries. When we write the tensor in the frame defined by
the principal directions this matrix is diagonal. If the components of the tensor are
given in another frame then there is always a transformation A, say, relating that
frame with the frame of principal axes. This transformation is called a similarity
transformation, is not unique and is given by the formula:


\[ T^{*} = A^{t} T A \tag{20.39} \]


where $T^{*}$ are the components of the tensor in the principal frame and t denotes
transpose.

Concerning the principal values — that is the entries of the diagonal matrix — we 
have the following results: 


— The principal values and the principal directions of $T$ and $T^{t}$ are the same

— The principal values of $T^{-1}$ are the reciprocals of those of $T$ and the principal directions
are the same

— The product tensors TQ and QT have the same principal values 

— A symmetric tensor is said to be positive (negative) definite if all its principal 
values are positive (negative); and positive (negative) semi-definite if one 
principal value is zero and the others positive (negative). 


Example 20.7.1 Consider the second rank symmetric tensor T whose characteristic
equation is:

\[ (3 - \lambda)(6 - \lambda)(1 - \lambda) = 0. \]

There are three distinct eigenvalues: $\lambda_1 = 3$, $\lambda_2 = 6$, $\lambda_3 = 1$.
To determine the principal direction for the eigenvalue $\lambda_3 = 1$ we use the
equations:


\[ (T_{ij} - \lambda_3 \delta_{ij})\, n_j = 0 \]

\[ n_i n_i = 1. \]


Hy 
We assume ni p= {7 and find from the first two equations: 
n3 
4n, + 2n2 =0 
2n; +n2 =0 
from which follows n, = nz = 0. The second equation implies then n; = +1 hence 
0 
nay= | 0 
+1 
Working similarly with the other two eigenvalues we find the remaining two 
42. sl. 
or ae ies 
principal directions n(2) = te | "a =| +75 
0 


0 
From these results it follows that the transformation matrix which diagonalizes 


the original matrix is: 


Ni) 0 : 0 ; +1 
A=] = +z +20 

a =| 

13) Fa +72 0 


which identifies two sets of principal direction axes, related by a reflection with respect to the
origin. It is easily verified that A is Euclidian orthogonal (i.e. satisfies $A^{t}A = I$),
which should be expected.¹⁸


¹⁸ The principal directions of a positive definite metric for different real eigenvalues are normal (in
the Euclidian sense) to each other.




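Exercise 20.7.1 Consider the second rank symmetric tensor represented by the
matrix:


\[
[T] = \begin{pmatrix} 5 & 1 & \sqrt{2} \\ 1 & 5 & \sqrt{2} \\ \sqrt{2} & \sqrt{2} & 6 \end{pmatrix}.
\]


a. Show that the characteristic equation is $\lambda^3 - 16\lambda^2 + 80\lambda - 128 = 0$.
b. Prove that the eigenvalues are $\lambda_{(1)} = 8$, $\lambda_{(2)} = \lambda_{(3)} = 4$.
c. Show that the principal direction corresponding to the distinct eigenvalue $\lambda_{(1)} = 8$
is $n_{(1)} = \frac{1}{2}\left(1, 1, \sqrt{2}\right)^{t}$.
d. Concerning $n_{(2)}$ choose any unit vector perpendicular to $n_{(1)}$. An obvious choice
is $n_{(2)} = \frac{1}{\sqrt{2}}\left(-1, 1, 0\right)^{t}$. Finally define $n_{(3)} = n_{(1)} \times n_{(2)}$ and show that
$n_{(3)} = \frac{1}{2}\left(-1, -1, \sqrt{2}\right)^{t}$.
e. Show that the matrix:

\[
A = \begin{pmatrix} \frac{1}{2} & \frac{1}{2} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ -\frac{1}{2} & -\frac{1}{2} & \frac{1}{\sqrt{2}} \end{pmatrix}
\]

diagonalizes the matrix [T].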


Example 20.7.2 The components of the stress tensor $\sigma_{ij}$ at a point P of a continuum
are given by the following symmetric matrix:


\[
[\sigma_{ij}] = \begin{pmatrix} 21 & -63 & 42 \\ -63 & 0 & 84 \\ 42 & 84 & -21 \end{pmatrix}.
\]


Determine:


a. The stress vector on a plane through the point P having the unit normal
$\hat{n} = \frac{1}{7}\left(2\hat{e}_1 - 3\hat{e}_2 + 6\hat{e}_3\right)$

b. The stress vector on a plane through the point P parallel to the plane defined by
the three points A(1, 0, 0), B(0, 1, 0) and C(0, 0, 2).




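Solution

a. The stress vector is

\[
t^{(\hat{n})} = [n][\sigma_{ij}] = \frac{1}{7}\,(2, -3, 6) \begin{pmatrix} 21 & -63 & 42 \\ -63 & 0 & 84 \\ 42 & 84 & -21 \end{pmatrix} = (69, 54, -42)
\]

or $t^{(\hat{n})} = 69\hat{e}_1 + 54\hat{e}_2 - 42\hat{e}_3$.

b. The equation of the plane through the points A, B, C is $AQ \cdot (BA \times CA) = 0$
where $Q(x_1, x_2, x_3)$ is an arbitrary point of the plane. This implies that the
equation of the plane is:


\[
\begin{vmatrix} x_1 - 1 & x_2 & x_3 \\ -1 & 1 & 0 \\ -1 & 0 & 2 \end{vmatrix} = 0 \;\Rightarrow\; 2x_1 + 2x_2 + x_3 = 2.
\]


The normal to this plane is $(2, 2, 1)^{t}$ (this can be found directly¹⁹ from $BA \times CA$)
so that the unit normal is $[n] = \frac{1}{3}(2, 2, 1)^{t}$. Hence the stress vector at P on this plane is


\[
t^{(\hat{n})} = [n][\sigma_{ij}] = \frac{1}{3}\,(2, 2, 1) \begin{pmatrix} 21 & -63 & 42 \\ -63 & 0 & 84 \\ 42 & 84 & -21 \end{pmatrix} = (-14, -14, 77)
\]

or in vector form $t^{(\hat{n})} = -14\hat{e}_1 - 14\hat{e}_2 + 77\hat{e}_3$.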


We note that the stress vector at the same point P is different for two different 
planes passing through the same point. 
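
The two stress vectors of this example can be reproduced with a few lines of code. This is a minimal sketch with hypothetical names; exact rational arithmetic is used so that the results can be compared with the integers quoted in the solution.

```python
from fractions import Fraction

# Stress matrix of Example 20.7.2.
sigma = [[21, -63, 42],
         [-63, 0, 84],
         [42, 84, -21]]

def stress_vector(sigma, n):
    """Cauchy formula t_i = sigma_ij n_j (sigma is symmetric)."""
    return [sum(sigma[i][j] * n[j] for j in range(3)) for i in range(3)]

# Part a: unit normal (2, -3, 6)/7.
n_a = [Fraction(2, 7), Fraction(-3, 7), Fraction(6, 7)]
# Part b: unit normal (2, 2, 1)/3 of the plane 2 x1 + 2 x2 + x3 = 2.
n_b = [Fraction(2, 3), Fraction(2, 3), Fraction(1, 3)]

t_a = stress_vector(sigma, n_a)
t_b = stress_vector(sigma, n_b)

print([int(c) for c in t_a])
print([int(c) for c in t_b])
```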


Example 20.7.3 The matrix representation of the stress tensor in the axes $\{x_1 x_2 x_3\}$
at the point P is:


\[
[\sigma_{ij}] = \begin{pmatrix} 1 & 3 & 2 \\ 3 & 1 & 0 \\ 2 & 0 & -2 \end{pmatrix}.
\]


Compute the components of the stress tensor in the frame $\{x'_1 x'_2 x'_3\}$ at the point
P, whose axes are obtained from $x_1 x_2 x_3$ by a rotation of 45° counterclockwise about
the $x_3$ axis.


¹⁹ Or use the equation of the plane in vector form $n \cdot (r - r_0) = 0$.




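Solution
The transformation matrix relating the two frames is:


\[
A = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]


where $\theta = 45°$. Replacing $\theta$ we find:


\[
A = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & \sqrt{2} \end{pmatrix}.
\]


The components of $[\sigma_{ij}]$ in the new basis are:

\[
[\sigma'_{ij}] = A[\sigma_{ij}]A^{t} = \begin{pmatrix} 4 & 0 & \sqrt{2} \\ 0 & -2 & -\sqrt{2} \\ \sqrt{2} & -\sqrt{2} & -2 \end{pmatrix}.
\]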

Example 20.7.4 The components of the stress tensor at a point P of a continuum
with respect to the axes $\{x_1 x_2 x_3\}$ are given by the following matrix:


\[
[\sigma_{ij}] = \begin{pmatrix} 57 & 0 & 24 \\ 0 & 50 & 0 \\ 24 & 0 & 43 \end{pmatrix}.
\]


Determine the principal stresses and the principal stress directions at P.


Solution
The characteristic equation of the matrix $[\sigma_{ij}]$ is found to be:


\[ (57 - \sigma)(50 - \sigma)(43 - \sigma) - (24)^2 (50 - \sigma) = 0 \;\Rightarrow\; (50 - \sigma)(\sigma - 25)(\sigma - 75) = 0 \]


therefore the principal stress values are $\sigma_{(1)} = 25$, $\sigma_{(2)} = 50$, $\sigma_{(3)} = 75$. These are
the eigenvalues of the stress tensor.
Note: Verify the invariance of the trace ($= I_\sigma$): 25 + 50 + 75 = 57 + 50 + 43.

The principal stress values being different there are three different principal stress
directions at P. For the principal stress:


$\sigma_{(1)} = 25$ we find the principal direction $\hat{n}_{(1)} = \pm\left(\frac{3}{5}\hat{e}_1 - \frac{4}{5}\hat{e}_3\right)$
$\sigma_{(2)} = 50$ we find the principal direction $\hat{n}_{(2)} = \pm\hat{e}_2$
$\sigma_{(3)} = 75$ we find the principal direction $\hat{n}_{(3)} = \pm\left(\frac{4}{5}\hat{e}_1 + \frac{3}{5}\hat{e}_3\right)$




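The transformation matrix which transforms the matrix $[\sigma_{ij}]$ to its principal axes
(i.e. diagonalizes the matrix) is, for the choice of the upper signs:


\[
A = \begin{pmatrix} \frac{3}{5} & 0 & -\frac{4}{5} \\ 0 & 1 & 0 \\ \frac{4}{5} & 0 & \frac{3}{5} \end{pmatrix}.
\]


Note: Verify that $A[\sigma_{ij}]A^{t} = {\rm diag}(25, 50, 75) = [\sigma^{*}_{ij}]$.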
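
The verification suggested in the note can be carried out as follows. This is a sketch under one assumption: the rows of A are taken to be the unit principal directions with one particular choice of signs, namely $(3/5, 0, -4/5)$, $(0, 1, 0)$ and $(4/5, 0, 3/5)$. The helper names are hypothetical.

```python
from fractions import Fraction as F

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

sigma = [[57, 0, 24],
         [0, 50, 0],
         [24, 0, 43]]

# Rows are the unit eigenvectors for the principal stresses 25, 50, 75
# (one sign choice; reversing the sign of any row gives another valid choice).
A = [[F(3, 5), 0, F(-4, 5)],
     [0, 1, 0],
     [F(4, 5), 0, F(3, 5)]]

sigma_principal = matmul(matmul(A, sigma), transpose(A))
orthogonality = matmul(A, transpose(A))

print([[int(x) for x in row] for row in sigma_principal])
```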


Chapter 21
The Stress: Strain Relation for Elastic
Newtonian Deformable Bodies


An important type of Newtonian deformable bodies are the ones in which the stress
is a homogeneous function of the rate of strain, that is, the stress and the rate
of strain vanish simultaneously. The simplest such bodies are the ones for which the
stress is a linear homogeneous function of the strain, that is the following relation
holds:


\[ t_{\mu\nu} = \sum_{\lambda,\rho} Y_{\mu\nu\lambda\rho}\, \dot{e}^{\lambda\rho} \tag{21.1} \]


where $Y_{\mu\nu\lambda\rho}$ is a tensor (the coupling tensor). The dimensions of $Y_{\mu\nu\lambda\rho}$ are [T]. These
Newtonian deformable bodies we call elastic Newtonian bodies.

The coefficients $Y_{\mu\nu\lambda\rho}$ are defined solely by the physical properties of the elastic
body and are called the elastic coefficients of the body. In general $Y_{\mu\nu\lambda\rho}$ has $3^4 = 81$
components. However due to the symmetry of the strain and the stress tensors the
independent components of the coupling tensor reduce to $6 \times 6 = 36$.

This number of parameters is still too large for the level of the present book.
Therefore we have to simplify things further and we consider the special class
of linear elastic bodies, which are defined by the requirement that their physical
properties are the same at all their points. Mathematically this is stated by the
condition:


\[ \partial_{\sigma} Y_{\mu\nu\lambda\rho} = 0 \tag{21.2} \]


or, equivalently, that the elastic coefficients are constants. Relation (21.1) with
$Y_{\mu\nu\lambda\rho}$ = constant is known as Hooke’s law. The name is unfortunate because it is
not a law of Physics, but rather a ‘selection rule’ (i.e. a simplifying assumption)
amongst the elastic continua.

In order to reduce further the number of independent coefficients of a linear 
elastic body, we specialize our considerations to the linear isotropic (Newtonian) 


© Springer Nature Switzerland AG 2019 741 
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-27347-7_21 




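elastic bodies, which are the linear elastic bodies whose
physical properties are isotropic, i.e. invariant under the action of the group SO(3).
Mathematically this is expressed by the requirement that the tensor $Y_{\mu\nu\lambda\rho}$ is isotropic
in the 3-D Euclidian space. The generic isotropic tensor of order two¹ in the
Euclidian space is the $\delta_{\mu\nu}$. It can be shown that the most general isotropic tensor
of order four is of the form:


\[ Y_{iklm} = A\,\delta_{ik}\delta_{lm} + B\,\delta_{il}\delta_{km} + C\,\delta_{im}\delta_{kl} \]


where A, B, C are scalars (of the group SO(3)) and in our case constants. Then we
have from (21.1):


\[
Y_{\lambda\rho\mu\nu}\,\dot{e}^{\mu\nu} = \left[ A\,\delta_{\lambda\rho}\delta_{\mu\nu} + B\,\delta_{\lambda\mu}\delta_{\rho\nu} + C\,\delta_{\lambda\nu}\delta_{\rho\mu} \right] \dot{e}^{\mu\nu}
= A\,\delta_{\lambda\rho}\,\dot{e}^{\mu}{}_{\mu} + B\,\dot{e}_{\lambda\rho} + C\,\dot{e}_{\rho\lambda} = (B + C)\,\dot{e}_{\lambda\rho} + A\,\dot{e}^{\mu}{}_{\mu}\,\delta_{\lambda\rho}
\]


where in the last step we have used that the strain tensor $\dot{e}_{\mu\nu}$ is symmetric. If we set
$\rho = B + C$ and $\lambda = A$ this expression becomes²:


\[ t_{\mu\nu} = \rho\,\dot{e}_{\mu\nu} + \lambda\,\delta_{\mu\nu} \sum_{\tau} \dot{e}_{\tau\tau} \tag{21.3} \]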


where $\sum_{\tau} \dot{e}_{\tau\tau}$ is the trace of $\dot{e}_{\mu\nu}$ (an SO(3) invariant) and $\rho$, $\lambda$ are two constant
coefficients, which have to be determined. We note that the additional requirement
of SO(3) symmetry reduces the number of independent components of $Y_{\mu\nu\lambda\rho}$ from
36 to 2 (the constants $\rho$, $\lambda$). Equation (21.3) can be inverted to give the strain in
terms of the stress as follows:


\[ \dot{e}_{\mu\nu} = \phi\, t_{\mu\nu} + \xi\,\delta_{\mu\nu} \sum_{\tau} t_{\tau\tau} \tag{21.4} \]


where $\phi$, $\xi$ are two new constant coefficients with dimensions $[T^{-1}]$.

The values of the constant coefficients $\rho$, $\lambda$ and $\phi$, $\xi$ are independent of the
particular type of stress and the resulting strain and are characteristics only of the
material of the linear elastic isotropic body. Obviously only one pair of parameters
is required to be computed, the other pair being calculated from the inverse
relations. The computation of the independent pair of parameters will be done from
relation (21.3) or (21.4) provided we know both the stress and the resulting strain.

In the following section we compute these coefficients for a linear elastic 
isotropic body. 


¹ This means that this tensor is an irreducible representation of the group SO(3).


² This result holds for any symmetric second order isotropic tensors, not only for the strain and the
stress tensor.




21.1 The Stress Rate - Strain Relation for a Linear Elastic 
Isotropic Body 


In order to compute the parameters $\rho$, $\lambda$ and $\phi$, $\xi$ in terms of the Young modulus E
and the Poisson ratio $\sigma$ for a linear elastic isotropic body we consider at an arbitrary
point inside the body the principal axes of strain or stress. Because the material
is isotropic these axes can be rotated and still remain principal axes. We choose
these axes³ so that both the strain tensor $\dot{e}_{\mu\nu}$ and the stress tensor $t_{\mu\nu}$ are diagonal.
We assume that $\dot{e}_{\mu\nu} = {\rm diag}(\dot{e}_{11}, \dot{e}_{22}, \dot{e}_{33})$ and $t_{\mu\nu} = {\rm diag}(t_{11}, t_{22}, t_{33})$. In order to
calculate the effect of the stress $t_{\mu\nu}$ we apply three independent longitudinal stresses,
one along each of the principal axes, and then we sum the results. We note that the
rate of strain motion along the x-axis is:


\[ \dot{e}_{11} = \delta x/x + \delta y/y + \delta z/z \]


where the strain motions $\delta x/x + \delta y/y + \delta z/z$ are due to the stress $t_{11}$ along the
x-axis. A similar result holds for the rate of strain and the stress for the other two
principal axes.

We start with the longitudinal stress along the principal x-axis (recall that along
a principal axis the stress is only longitudinal!). This stress creates a strain motion
which is elongation along the x-axis by $\delta x$ and contraction along the other two
axes by $\delta y$ and $\delta z$. This strain is produced by the stress component $t_{11}$. We have for
each part of the strain motion the relations:


1. Strain motion along the x axis: $\delta x/x = \frac{1}{E}\, t_{11}$ (by the definition of E)

2. Strain motion along the y axis: $\delta y/y = -\sigma(\delta x/x) = -\frac{\sigma}{E}\, t_{11}$ (by the
definition of $\sigma$)

3. Strain motion along the z axis: $\delta z/z = -\sigma(\delta x/x) = -\frac{\sigma}{E}\, t_{11}$ (by the definition of $\sigma$).


Similar results hold for the stresses $t_{22}$, $t_{33}$ applied along the directions y, z
respectively and the resulting strain motions. Therefore we have:
Strain motion along the x-axis:


\[ \dot{e}_{11} = \frac{1}{E}\left[ t_{11} - \sigma (t_{22} + t_{33}) \right] \]


Strain motion along the y-axis:


\[ \dot{e}_{22} = \frac{1}{E}\left[ t_{22} - \sigma (t_{33} + t_{11}) \right] \]


Strain motion along the z-axis:

\[ \dot{e}_{33} = \frac{1}{E}\left[ t_{33} - \sigma (t_{11} + t_{22}) \right]. \]


³ Here we have two positive definite metrics which are diagonalized simultaneously, that is, their
principal axes coincide. See footnote 5 Chap. 2.


As we remarked above the total strain under the action of the total stress (i.e. 
tuy = diag(ty1, t22, 133) ) along the principal axes is the sum of the strain motions 
along each principal axis. Therefore for the total rate of stain motion we have: 


. 1. 
euv = pias (t11 — o (t22 + 133), to2 — 0 (t33 + C11), 033 — O(t11 + f22)) 


1. Oo... 
guise (t11, f22, 133) — 7 iias (t22 + £33, 133 + t11, ti + f22) 


aes (t t t )+ —dia (t t t )-=—t 
l ’ : l ? ? 
E § 11, /22, 133 E § 11, 422, 133 E ? 


from which follows: 


2 l+o oO 
eCuv = Ele = Foe Litpp: (21.5) 
p 


This is the general relation relating the applied stress and the resulting rate of strain 
motion in a linear elastic isotropic body. Comparing with (21.4) we find that for a 
linear elastic isotropic Newtonian body: 


ee ee (21.6) 


¢ E ° E 


where $E$, $\sigma$ are the Young modulus and the Poisson ratio of the body.
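As a quick numerical cross-check of relation (21.5), the following Python sketch evaluates the rate of strain for a diagonal stress and compares it with the three component formulas obtained by superposition. The helper name `strain_rate_from_stress` and all numerical values (Young modulus, Poisson ratio, stresses) are illustrative assumptions, not taken from the text.

```python
# Sketch of relation (21.5); all numerical values are illustrative assumptions.

def trace(m):
    """Sum of the diagonal entries of a 3x3 matrix given as nested lists."""
    return m[0][0] + m[1][1] + m[2][2]

def strain_rate_from_stress(t, E, sigma):
    """Rate of strain (1+sigma)/E * t - sigma/E * Trace(t) * identity."""
    tr = trace(t)
    return [[(1 + sigma) / E * t[i][j] - (sigma / E * tr if i == j else 0.0)
             for j in range(3)] for i in range(3)]

E, sigma = 2.0e11, 0.3                 # assumed Young modulus (Pa), Poisson ratio
t11, t22, t33 = 3.0e6, -1.0e6, 2.0e6   # assumed principal stresses (Pa)
t = [[t11, 0.0, 0.0], [0.0, t22, 0.0], [0.0, 0.0, t33]]
e = strain_rate_from_stress(t, E, sigma)

# Superposition of the three longitudinal stresses, one principal axis at a time.
e11 = (t11 - sigma * (t22 + t33)) / E
e22 = (t22 - sigma * (t33 + t11)) / E
e33 = (t33 - sigma * (t11 + t22)) / E

print(e[0][0], e11)
print(e[1][1], e22)
print(e[2][2], e33)
# Trace check: Trace(e) should equal (1 - 2*sigma)/E * Trace(t).
print(trace(e), (1 - 2 * sigma) / E * trace(t))
```

Each printed pair should agree up to rounding; the last pair illustrates the proportionality between the two traces that Exercise 21.1.1 asks for.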


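Exercise 21.1.1 Let $\mathrm{Trace}(\dot e) = \sum_{\rho} \dot e_{\rho\rho}$ and $\mathrm{Trace}(t) = \sum_{\rho} t_{\rho\rho}$. Show that for the
case under discussion the following relation holds:

$$\mathrm{Trace}(t) = \frac{E}{1-2\sigma}\,\mathrm{Trace}(\dot e). \qquad (21.7)$$

Using this result show that the relation between the strain and the stress for a linear
elastic isotropic Newtonian body is:

$$t_{\mu\nu} = \frac{E}{1+\sigma}\left[\dot e_{\mu\nu} + \frac{\sigma}{1-2\sigma}\,\delta_{\mu\nu}\,\mathrm{Trace}(\dot e)\right]. \qquad (21.8)$$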




Conclude that for longitudinal stresses, the parameters p, i of equation (21.3) are 
as follows: 


_ E ae Eo 
ae? “(1 ej = 2a) 


p (21.9) 


Having computed the relation between the stress and the strain for longitudinal
stress applied to a linear elastic isotropic body in terms of the parameters $E$, $\sigma$ we
compute the shear modulus coefficient n and the bulk modulus coefficient k for
these materials, by considering a shear stress and a bulk stress acting on such a
material and using (21.6).


21.1.1 The Coefficients n, k of a Linear Elastic Isotropic Body 


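We start with the shear modulus coefficient $n = \frac{F/S}{\theta}$ where $\theta$ is the angle of shear.

We consider a tangential stress as in Fig. 20.5 and find the stress tensor:

$$t_{\mu\nu} = \begin{pmatrix} 0 & t_{12} & 0 \\ t_{12} & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

where $t_{12} = F/S$. The rate of the strain tensor⁴ per unit length is (for small angles $\theta$):

$$\dot e_{\mu\nu} = \begin{pmatrix} 0 & \theta/2 & 0 \\ \theta/2 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

We note that $\mathrm{Trace}(t) = \mathrm{Trace}(\dot e) = 0$ therefore the stress strain relation (21.6)
for such strains becomes $t_{\mu\nu} = \frac{E}{1+\sigma}\,\dot e_{\mu\nu}$. Replacing $t_{\mu\nu}$, $\dot e_{\mu\nu}$ from these expressions
we find:

$$t_{12} = \frac{E}{1+\sigma}\,\frac{\theta}{2}$$

(where $t_{12} = F/S$) therefore the shear modulus coefficient of a linear elastic
isotropic body is:

$$n = \frac{t_{12}}{\theta} = \frac{E}{2(1+\sigma)}. \qquad (21.10)$$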


⁴ The non-vanishing component is $\dot e_{12}$, that is, the xy and yx components only.




We continue with the bulk modulus coefficient of a linear elastic isotropic 
Newtonian body. Let p be the isotropic pressure resulting from the application of a 
bulk stress. Then the stress tensor is given by the relation: 


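$$t_{\mu\nu} = -p\,\delta_{\mu\nu}. \qquad (21.11)$$

Replacing in (21.6) we compute that the resulting strain tensor is:

$$\dot e_{\mu\nu} = \left[-\frac{p(1+\sigma)}{E} + \frac{\sigma}{E}\,3p\right]\delta_{\mu\nu} = -\frac{(1-2\sigma)\,p}{E}\,\delta_{\mu\nu}. \qquad (21.12)$$

For strain motion due to bulk stress we have:

$$\frac{\Delta V}{V} = \mathrm{Trace}(\dot e) = -\frac{3(1-2\sigma)}{E}\,p. \qquad (21.13)$$

Therefore the bulk modulus coefficient k of a linear elastic isotropic body is:

$$k = -\frac{p}{\Delta V/V} = \frac{E}{3(1-2\sigma)}. \qquad (21.14)$$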
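The two coefficients can be recovered numerically by feeding a pure shear stress and an isotropic pressure into the stress to rate-of-strain relation $\dot e_{\mu\nu} = \frac{1+\sigma}{E}t_{\mu\nu} - \frac{\sigma}{E}\delta_{\mu\nu}\mathrm{Trace}(t)$. The Python sketch below does this under stated assumptions: the shear angle is read off from $\dot e_{12} = \theta/2$, and the values of E, σ, the shear stress and the pressure are illustrative, not taken from the text.

```python
# Sketch: shear and bulk coefficients from the linear isotropic relation.
# All numerical values are illustrative assumptions.

def trace(m):
    return m[0][0] + m[1][1] + m[2][2]

def strain_rate_from_stress(t, E, sigma):
    """Rate of strain (1+sigma)/E * t - sigma/E * Trace(t) * identity."""
    tr = trace(t)
    return [[(1 + sigma) / E * t[i][j] - (sigma / E * tr if i == j else 0.0)
             for j in range(3)] for i in range(3)]

E, sigma = 2.0e11, 0.3            # assumed Young modulus (Pa), Poisson ratio

# Pure shear: only the 12 and 21 components of the stress are non-zero.
t12 = 1.0e6
t_shear = [[0.0, t12, 0.0], [t12, 0.0, 0.0], [0.0, 0.0, 0.0]]
e_shear = strain_rate_from_stress(t_shear, E, sigma)
theta = 2.0 * e_shear[0][1]       # the strain tensor has entries theta/2
n = t12 / theta                   # shear modulus coefficient

# Isotropic pressure: t = -p * identity.
p = 1.0e6
t_bulk = [[-p, 0.0, 0.0], [0.0, -p, 0.0], [0.0, 0.0, -p]]
e_bulk = strain_rate_from_stress(t_bulk, E, sigma)
dV_over_V = trace(e_bulk)         # relative volume change
k = -p / dV_over_V                # bulk modulus coefficient

print(n, E / (2 * (1 + sigma)))
print(k, E / (3 * (1 - 2 * sigma)))
```

Each printed pair should agree up to rounding, which ties the closed-form coefficients to the general relation rather than to a separate derivation.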


21.1.2 A Second Derivation of the Rate of Strain Stress
Relation from Hooke’s Law


We consider a stress $t_{ij}$ acting on a given linear elastic Newtonian body producing
the strain $\dot e_{ij}$. As we have seen, at an arbitrary point of the body along the principal
directions of strain the rate of strain reduces to translations and the strain tensor is
diagonal $\dot e_{ij} = \mathrm{diag}(e_1, e_2, e_3)$. Hooke’s law states that:


In a linear elastic body the rate of strain produced in a definite direction is proportional to 
the stress (=force per unit area) in that direction. 


Therefore we infer that the stress tensor $t_{ij}$ must also be diagonal in the strain
principal directions or, to put it in an equivalent way, the principal axes of strain and
stress coincide. We write $t_{ij} = \mathrm{diag}(t_1, t_2, t_3)$.

Here the geometric assumptions end and we have to take Physics into account. 
As we have already mentioned it is an experimental fact that the application
of longitudinal stresses produces extensions, measured by the Young’s modulus,
and transverse contractions measured by the Poisson ratio. Assuming the material
to be isotropic, so that we have a linear elastic isotropic body, these transverse
contractions are characterized by the same Poisson ratio.

We consider an arbitrary force acting on the linear elastic isotropic Newtonian 
body and analyze it in three components along the principal axes of the strain tensor. 
These produce a longitudinal rate of strain along each direction, each strain being 
independent of the other two. Therefore the relation between the stress tensor and 




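the rate of strain tensor (per unit length of material), along the principal axes of
strain only, is⁵:

$$\begin{aligned}
\dot e_{11} &= \frac{1}{E}\,t_{11} - \frac{\sigma}{E}(t_{22} + t_{33}) = \frac{1+\sigma}{E}\,t_{11} - \frac{\sigma}{E}(t_{11} + t_{22} + t_{33}) \\
\dot e_{22} &= \frac{1}{E}\,t_{22} - \frac{\sigma}{E}(t_{33} + t_{11}) = \frac{1+\sigma}{E}\,t_{22} - \frac{\sigma}{E}(t_{11} + t_{22} + t_{33}) \qquad (21.15) \\
\dot e_{33} &= \frac{1}{E}\,t_{33} - \frac{\sigma}{E}(t_{11} + t_{22}) = \frac{1+\sigma}{E}\,t_{33} - \frac{\sigma}{E}(t_{11} + t_{22} + t_{33})
\end{aligned}$$

or, in tensor notation:

$$\dot e_{\mu\nu} = \frac{1+\sigma}{E}\,t_{\mu\nu} - \frac{\sigma}{E}(t_{11} + t_{22} + t_{33})\,\delta_{\mu\nu}, \qquad \mu, \nu = 1, 2, 3. \qquad (21.16)$$

We define the invariant⁶:

$$\psi = -\frac{\sigma}{E}(t_{11} + t_{22} + t_{33}) \qquad (21.17)$$

and relation (21.16) is written as:

$$\dot e_{\mu\nu} = \frac{1+\sigma}{E}\,t_{\mu\nu} + \psi\,\delta_{\mu\nu}, \qquad \mu, \nu = 1, 2, 3. \qquad (21.18)$$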


We need to emphasize that formula (21.16) applies only at a point P because
the principal directions of stress in general vary from point to point. That is, in
general there does not exist a simple coordinate system in which the stress and
the strain components are related as in (21.16). However this presents no difficulty
because, since the stress and the rate of strain are tensors, we may compute their
components in any frame if we know them in one or, to put it in an equivalent way,
equation (21.16) is covariant. Indeed let us consider a point Q far from P and let the
frame we choose at Q be $\{x^{\mu'}\}$. If we denote the coordinate frame of the principal
directions at P by $\{x^{\mu}\}$ then we have the coordinate transformation:

$$x^{\mu'} = a^{\mu'}{}_{\nu}\,x^{\nu} \qquad (21.19)$$

where the Jacobian matrix $[a^{\mu'}{}_{\nu}]$ satisfies the orthogonality condition:

$$a^{\mu'}{}_{\nu}\,a^{\mu'}{}_{\rho} = \delta_{\nu\rho}. \qquad (21.20)$$


⁵ We use the same $E$, $\sigma$ because the continuum is isotropic, therefore the scalars on the surface of a
sphere centered at the origin (here the origin of the principal directions of strain/stress) must have
the same value.


⁶ This is an invariant because the quantities $E$, $\sigma$ being tensors are invariants.




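As it is well known this condition is a result of the Euclidean character of the space.
It is also well known that under a Euclidean (or orthogonal) transformation the trace
$t_{11} + t_{22} + t_{33}$ is invariant.⁷ In the coordinate system at the point Q relation (21.18) becomes:

$$\dot e'_{\mu\nu} = \frac{1+\sigma}{E}\,t'_{\mu\nu} + \psi\,\delta_{\mu\nu}. \qquad (21.22)$$

In component form (21.22) is written in terms of corresponding (symmetric)
matrices as follows⁸:

$$\begin{pmatrix} \dot e'_{11} & \dot e'_{12} & \dot e'_{13} \\ * & \dot e'_{22} & \dot e'_{23} \\ \text{symmetric} & * & \dot e'_{33} \end{pmatrix} = \begin{pmatrix} \frac{1+\sigma}{E}\,t'_{11} + \psi & \frac{1+\sigma}{E}\,t'_{12} & \frac{1+\sigma}{E}\,t'_{13} \\ * & \frac{1+\sigma}{E}\,t'_{22} + \psi & \frac{1+\sigma}{E}\,t'_{23} \\ \text{symmetric} & * & \frac{1+\sigma}{E}\,t'_{33} + \psi \end{pmatrix} \qquad (21.23)$$

In order to invert relation (21.22) we compute the trace $\Psi = \dot e_{11} + \dot e_{22} + \dot e_{33}$ in terms
of the invariant $\psi$. From (21.22) we have in an obvious notation:

$$\Psi = \frac{1+\sigma}{E}(t_{11} + t_{22} + t_{33}) + 3\psi = -\frac{1-2\sigma}{\sigma}\,\psi. \qquad (21.24)$$

Then (21.22) gives:

$$t'_{\mu\nu} = \frac{E}{1+\sigma}\left[\dot e'_{\mu\nu} - \psi\,\delta_{\mu\nu}\right] = \frac{E}{1+\sigma}\left[\dot e'_{\mu\nu} + \frac{\sigma}{1-2\sigma}\,\Psi\,\delta_{\mu\nu}\right]. \qquad (21.25)$$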


This expression coincides with relation (21.8) we derived in the previous section 
using a more mathematical method. 
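The covariance argument can be illustrated numerically: start from a diagonal stress in the principal frame, rotate both the stress and the rate of strain with the same orthogonal matrix, and check that the relation between them keeps its form, that the trace is unchanged, and that the inverse relation recovers the rotated stress. The Python sketch below is only an illustration; the function names, the rotation angles and the values E = 1, σ = 0.25 are assumptions made for the example.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def trace(m):
    return m[0][0] + m[1][1] + m[2][2]

def strain_rate_from_stress(t, E, sigma):
    """Rate of strain (1+sigma)/E * t - sigma/E * Trace(t) * identity."""
    tr = trace(t)
    return [[(1 + sigma) / E * t[i][j] - (sigma / E * tr if i == j else 0.0)
             for j in range(3)] for i in range(3)]

def stress_from_strain_rate(e, E, sigma):
    """Inverse relation E/(1+sigma) * [e + sigma/(1-2 sigma) * Trace(e) * identity]."""
    big_psi = trace(e)
    return [[E / (1 + sigma) * (e[i][j] + (sigma / (1 - 2 * sigma) * big_psi if i == j else 0.0))
             for j in range(3)] for i in range(3)]

def rotation(angle_z, angle_x):
    """Orthogonal matrix: rotation about z followed by rotation about x."""
    cz, sz = math.cos(angle_z), math.sin(angle_z)
    cx, sx = math.cos(angle_x), math.sin(angle_x)
    rz = [[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]]
    rx = [[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]]
    return matmul(rx, rz)

def transform(t, a):
    """Components of a rank-2 tensor in the rotated frame: a t a^T."""
    return matmul(matmul(a, t), transpose(a))

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

E, sigma = 1.0, 0.25                       # illustrative (dimensionless) values
t = [[3.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 2.0]]   # principal frame at P
e = strain_rate_from_stress(t, E, sigma)

a = rotation(0.4, 1.1)                     # assumed orthogonal transformation
t_q, e_q = transform(t, a), transform(e, a)

print(max_abs_diff(e_q, strain_rate_from_stress(t_q, E, sigma)))
print(trace(t), trace(t_q))
print(max_abs_diff(t_q, stress_from_strain_rate(e_q, E, sigma)))
```

The first and third printed numbers should be at rounding level, and the two traces should coincide, even though the rotated tensors are no longer diagonal.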


21.2 The Dynamics of Strain Motion of an Elastic Body 


21.2.1 The Force Due to a Stress Tensor 


In Newtonian dynamics we have only one law: Newton’s Second Law. Therefore 
in order to write down an equation of motion for a general strain motion we need 
to compute the force which creates the stress. To do that we consider at a point P 
of the body an element of mass dM whose surface area is (say) S. We consider an 
element of surface $d\sigma$ on S and assume that the force on the mass dM due to the


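⁷ One can prove the invariance directly using the transformation rule (21.20). Indeed we have:

$$\sum_{\mu} t'_{\mu\mu} = \sum_{\mu,\rho,\tau} a^{\mu'}{}_{\rho}\,a^{\mu'}{}_{\tau}\,t_{\rho\tau} = \sum_{\rho,\tau} \delta_{\rho\tau}\,t_{\rho\tau} = \sum_{\rho} t_{\rho\rho}. \qquad (21.21)$$


⁸ As we have already remarked the isotropy implies only that the tensors are symmetric, not
diagonal!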




rest of the material of the body is dF. We also choose a coordinate frame in which
both the rate of strain and the stress tensor at P are diagonal. Then we know⁹ that
along each principal direction of the strain, the $x^1$ say, the stress (=pressure) $t_1$ is
due to the action of the projection of the force dF along direction $x^1$ acting on the
projection $d\sigma^1$ of the surface¹⁰ $d\sigma$, that is $dF_1 = t_1\,d\sigma^1$. Because the stresses along
the three principal directions are independent we write in that frame¹¹:

$$dF_{\mu} = t_{\mu\nu}\,d\sigma^{\nu} \qquad (21.26)$$

where $t_{\mu\nu} = \mathrm{diag}(t_1, t_2, t_3)$ and $d\sigma^{\nu} = (d\sigma^1, d\sigma^2, d\sigma^3)$. The force on the whole
surface S is:

$$F_{\mu} = \int_S t_{\mu\nu}\,d\sigma^{\nu}. \qquad (21.27)$$

Applying the divergence theorem we write:

$$F_{\mu} = \int_V t_{\mu\nu,\nu}\,dV \qquad (21.28)$$

where V is the volume enclosed by the surface S. The importance of the above
expressions is that they are covariant, hence even if they have been derived in a
special coordinate system at P (the principal axes of the strain and the stress tensors)
their validity extends to all coordinate systems; the explicit computation in a specific
coordinate system is done by means of the transformation (21.20). Because the
volume is infinitesimal we conclude from (21.28) that the density of force $f^{\mu}$ per
unit volume is the divergence of the tensor $t_{\mu\nu}$, that is:

$$f^{\mu} = t^{\mu\nu}{}_{,\nu}. \qquad (21.29)$$


21.2.2. Newton’s Second Law in Terms of the Displacement 
Vector 


Before we write Newton’s second law for strain motion we have to make clear
what we are looking for. As we have repeatedly mentioned the strain motion is the
residual of the motion of a deformable Newtonian body after the rigid motion
(global translation and global rotation) has been removed. This means that the


⁹ This is a key assumption!


¹⁰ The (infinitesimal) surface $d\sigma$ can be associated with a vector n normal to the surface whose
length equals $d\sigma$. This is the vector $d\sigma^{\nu}$.


¹¹ In general the density of force in the direction specified by the vector $A^{\nu}$ is given by $t_{\mu\nu}A^{\nu}$.




strain motion is not a “usual” motion, that is, change of position, but a motion which 
involves only the “change of shape” (i.e. deformation) of the body. Now the change 
in the shape of a body is described by the displacement vector field, which is a 
vector field defined throughout the material of the body. Therefore to write Newton’s 
second law for strain motion means to write down the equation, which governs the 
evolution of this vector field in time under the action of a given stress. This point 
must be clearly understood if we are going to understand properly the Physics of 
strain motion. The above imply that we have to express the strain tensor in terms of
the relative displacement vector (or connecting vector) $s^{\mu}$ that is, equation (20.24)

$$\dot e_{\mu\nu} = \frac{1}{2}\left(s_{\mu,\nu} + s_{\nu,\mu}\right)$$

in which the rate of strain tensor is expressed in terms of the derivatives of the strain
vector $s^{\mu}$. We shall also use the invariant $\Psi = s^{\mu}{}_{,\mu} = \nabla\cdot\mathbf{s}$.

In the following we apply Newton’s Second Law for strain motion for a linear
elastic isotropic body. To do that we recall that for such a deformable body the stress
and the rate of strain are related by the following expression:

$$t_{\mu\nu} = \frac{E}{2(1+\sigma)}\left[s_{\mu,\nu} + s_{\nu,\mu} + \frac{2\sigma}{1-2\sigma}\,\delta_{\mu\nu}\,\Psi\right]. \qquad (21.30)$$

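Exercise 21.2.1 Show that in terms of the displacement vector $\mathbf{s} = (u, v, w)$ the
components of the stress tensor of a linear elastic isotropic Newtonian body are
given by the entries of the following symmetric matrix:

$$[t_{\mu\nu}] = \begin{pmatrix}
\frac{E}{1+\sigma}\left[\frac{\partial u}{\partial x} + \frac{\sigma}{1-2\sigma}\nabla\cdot\mathbf{s}\right] & \frac{E}{2(1+\sigma)}\left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right) & \frac{E}{2(1+\sigma)}\left(\frac{\partial u}{\partial z} + \frac{\partial w}{\partial x}\right) \\
* & \frac{E}{1+\sigma}\left[\frac{\partial v}{\partial y} + \frac{\sigma}{1-2\sigma}\nabla\cdot\mathbf{s}\right] & \frac{E}{2(1+\sigma)}\left(\frac{\partial v}{\partial z} + \frac{\partial w}{\partial y}\right) \\
* & * & \frac{E}{1+\sigma}\left[\frac{\partial w}{\partial z} + \frac{\sigma}{1-2\sigma}\nabla\cdot\mathbf{s}\right]
\end{pmatrix} \qquad (21.31)$$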


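Now it is an easy task to compute the density of force due to the stress (i.e. the
forces due to the other particles of the body) in terms of the derivatives of the strain
vector $s^{\mu}$. Indeed by taking the divergence of (21.30) and using (21.29) we find¹²:

$$\begin{aligned}
f^{\mu} = t^{\mu\nu}{}_{,\nu} &= \frac{E}{2(1+\sigma)}\left[s^{\mu,\nu}{}_{,\nu} + s^{\nu,\mu}{}_{,\nu} + \frac{2\sigma}{1-2\sigma}\,(s^{\nu}{}_{,\nu})^{,\mu}\right] \\
&= \frac{E}{2(1+\sigma)}\left[s^{\mu,\nu}{}_{,\nu} + \frac{1}{1-2\sigma}\,(s^{\nu}{}_{,\nu})^{,\mu}\right]. \qquad (21.32)
\end{aligned}$$


¹² The material is assumed to be locally homogeneous and isotropic therefore $E$, $\sigma$ are constants
during the strain motion.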




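In standard vector notation this relation is written as follows:

$$\mathbf{f} = \frac{E}{2(1+\sigma)}\left[\nabla^2\mathbf{s} + \frac{1}{1-2\sigma}\,\nabla(\nabla\cdot\mathbf{s})\right]. \qquad (21.33)$$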
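The passage from the stress tensor to the force density can be checked numerically by comparing the divergence of the stress with the closed vector formula $\mathbf f = \frac{E}{2(1+\sigma)}[\nabla^2\mathbf s + \frac{1}{1-2\sigma}\nabla(\nabla\cdot\mathbf s)]$. The Python sketch below does this with central differences for an illustrative quadratic displacement field, for which the differences are exact up to rounding. The field `s`, the constants and all function names are assumptions made for the example.

```python
E, SIGMA = 1.0, 0.25   # illustrative elastic constants (assumed)
H = 0.1                # step for the central differences

def s(p):
    """Illustrative quadratic displacement field (u, v, w)."""
    x, y, z = p
    return [x * x + 2 * x * y, y * z - x * x, z * z + x * y]

def shifted(p, axis, d):
    q = list(p)
    q[axis] += d
    return q

def partial(f, p, axis):
    """Central difference of a scalar function f at the point p."""
    return (f(shifted(p, axis, H)) - f(shifted(p, axis, -H))) / (2 * H)

def div_s(p):
    return sum(partial(lambda q, i=i: s(q)[i], p, i) for i in range(3))

def stress(p):
    """Stress built from the displacement gradients (linear isotropic body)."""
    g = [[partial(lambda q, i=i: s(q)[i], p, j) for j in range(3)] for i in range(3)]
    psi = g[0][0] + g[1][1] + g[2][2]
    c = E / (2 * (1 + SIGMA))
    return [[c * (g[i][j] + g[j][i] + (2 * SIGMA / (1 - 2 * SIGMA) * psi if i == j else 0.0))
             for j in range(3)] for i in range(3)]

def force_from_divergence(p):
    """f_mu = sum over nu of the derivative of t_{mu nu} with respect to x_nu."""
    return [sum(partial(lambda q, i=i, j=j: stress(q)[i][j], p, j) for j in range(3))
            for i in range(3)]

def laplacian(f, p):
    return sum((f(shifted(p, j, H)) - 2 * f(p) + f(shifted(p, j, -H))) / (H * H)
               for j in range(3))

def force_from_formula(p):
    """f = E/(2(1+sigma)) [laplacian(s) + grad(div s)/(1 - 2 sigma)]."""
    c = E / (2 * (1 + SIGMA))
    return [c * (laplacian(lambda q, i=i: s(q)[i], p) + partial(div_s, p, i) / (1 - 2 * SIGMA))
            for i in range(3)]

point = [0.3, -0.7, 1.2]
f_div = force_from_divergence(point)
f_vec = force_from_formula(point)
print(f_div)
print(f_vec)
```

The two printed vectors should agree component by component; for this particular field the force density is constant in space, so the choice of `point` does not matter.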


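Exercise 21.2.2 Show that the components of the force density of a linear elastic
isotropic Newtonian body are given by the entries of the following matrix:

$$[f^{\mu}] = \begin{pmatrix}
\frac{E}{2(1+\sigma)}\left[\nabla^2 u + \frac{1}{1-2\sigma}\,\frac{\partial}{\partial x}(\nabla\cdot\mathbf{s})\right] \\
\frac{E}{2(1+\sigma)}\left[\nabla^2 v + \frac{1}{1-2\sigma}\,\frac{\partial}{\partial y}(\nabla\cdot\mathbf{s})\right] \\
\frac{E}{2(1+\sigma)}\left[\nabla^2 w + \frac{1}{1-2\sigma}\,\frac{\partial}{\partial z}(\nabla\cdot\mathbf{s})\right]
\end{pmatrix} \qquad (21.34)$$

where $\mathbf{s} = (u, v, w)$.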


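Now suppose that on a linear elastic isotropic Newtonian body of mass density
$\rho(t, \mathbf{r})$, acts an external force with volume density $R^{\mu}$. Then Newton’s Second
Law¹³ yields for the strain vector:

$$R^{\mu} + f^{\mu} = \rho\,s^{\mu}{}_{,tt} \qquad (21.35)$$

or by (21.32):

$$R^{\mu} + \frac{E}{2(1+\sigma)}\left[s^{\mu,\nu}{}_{,\nu} + \frac{1}{1-2\sigma}\,(s^{\nu}{}_{,\nu})^{,\mu}\right] = \rho\,s^{\mu}{}_{,tt}. \qquad (21.36)$$

We take the partial derivative $s^{\mu}{}_{,tt}$ because the strain refers to a specific point
in space and the body is assumed to be under no global translational and/or global
rotational motion. In standard vector notation equation (21.36) becomes:

$$\mathbf{R} + \frac{E}{2(1+\sigma)}\left[\nabla^2\mathbf{s} + \frac{1}{1-2\sigma}\,\nabla(\nabla\cdot\mathbf{s})\right] = \rho\,\frac{\partial^2\mathbf{s}}{\partial t^2}. \qquad (21.37)$$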


21.2.3. Kinematic Interpretation of the Equation of Motion 
of the Strain Vector in a Linear Elastic Isotropic 
Newtonian Body 


Before we discuss the kinematic interpretation of the equation of motion of strain 
we need the following result from vector calculus. 


¹³ The rhs is mass density times acceleration and the lhs is density of force. We apply Newton’s
Second Law in the form: Total force density = mass density times acceleration. Acceleration is
the second order time derivative of the strain vector.




Suppose we have a differentiable vector field f and we want to write it as the sum 
of two other vector fields, one of which has divergence zero and the other curl zero, 
that is: 


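$$\mathbf{f} = \nabla\times\mathbf{A} + \nabla\phi \qquad (21.38)$$

where $\nabla\cdot\mathbf{A} = 0$. To find this decomposition¹⁴ we introduce the new vector field $\mathbf{w}$
with the relations:

$$\phi = -\frac{1}{4\pi}\,\nabla\cdot\mathbf{w}, \qquad \mathbf{A} = \frac{1}{4\pi}\,\nabla\times\mathbf{w}. \qquad (21.39)$$

We note that this vector field satisfies the requirement $\nabla\cdot\mathbf{A} = 0$. To compute the
vector field $\mathbf{w}$ in terms of $\mathbf{f}$ we replace $\mathbf{A}$ and $\phi$ from (21.39) in (21.38) and find:

$$\mathbf{f} = \frac{1}{4\pi}\,\nabla\times(\nabla\times\mathbf{w}) - \frac{1}{4\pi}\,\nabla(\nabla\cdot\mathbf{w}).$$

Using the identity of vector calculus:

$$\nabla\times(\nabla\times\mathbf{w}) = \nabla(\nabla\cdot\mathbf{w}) - \nabla^2\mathbf{w} \qquad (21.40)$$

we get:

$$\nabla^2\mathbf{w} = -4\pi\,\mathbf{f}. \qquad (21.41)$$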


This decomposition of f is called decomposition into the sum of a solenoidal and 
an irrotational vector field. 


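¹⁴ We give a second proof in terms of tensors. We have:
$$f^{\mu} = \varepsilon^{\mu\nu\rho}A_{\rho,\nu} + \phi^{,\mu}$$
and we assume that the vector
$$A_{\rho} = \varepsilon_{\rho\sigma\tau}\,w^{\tau,\sigma}$$
so that $A^{\rho}{}_{,\rho} = 0$ and
$$\phi = -w^{\sigma}{}_{,\sigma}$$
hence $\varepsilon^{\mu\nu\rho}\phi_{,\nu\rho} = 0$. The vector field $w^{\mu}$ is an auxiliary vector field. Then we have:

$$f^{\mu} = \varepsilon^{\mu\nu\rho}\varepsilon_{\rho\sigma\tau}\,w^{\tau,\sigma}{}_{,\nu} - (w^{\sigma}{}_{,\sigma})^{,\mu} = \left(\delta^{\mu}_{\sigma}\delta^{\nu}_{\tau} - \delta^{\mu}_{\tau}\delta^{\nu}_{\sigma}\right)w^{\tau,\sigma}{}_{,\nu} - (w^{\sigma}{}_{,\sigma})^{,\mu}$$

$$= (w^{\nu}{}_{,\nu})^{,\mu} - w^{\mu,\nu}{}_{,\nu} - (w^{\sigma}{}_{,\sigma})^{,\mu} = -w^{\mu,\nu}{}_{,\nu}.$$

The above remain true if we consider a constant in front of the definition of $A_{\mu}$, $\phi$.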




Exercise 21.2.3 Express the vector field $\mathbf{f} = yz\,\mathbf{i} + xz\,\mathbf{j} + (xy - xz)\,\mathbf{k}$ as a sum of an
irrotational and a solenoidal vector field. [Hint: The answer is

$$\mathbf{w}(P) = \iiint \frac{\mathbf{f}}{r}\,dv \qquad (21.42)$$

where r is the distance of the element of integration dv from the point P.]


Let us consider now a non-dissipative¹⁵ scalar wave in the x-y plane traveling
down the x-axis with velocity V and possessing a wave profile $y = f(x)$ at time
$t = 0$. Then at any time t the profile of the wave is $y(x, t) = f(x - Vt)$, that is
the form of the wave is propagating with velocity V as the wave propagates. The
equation of propagation of the wave is found as follows:

$$\frac{\partial^2 y}{\partial x^2} = \frac{\partial^2 f(x - Vt)}{\partial x^2}, \quad \frac{\partial^2 y}{\partial t^2} = V^2\,\frac{\partial^2 f(x - Vt)}{\partial x^2} \;\Rightarrow\; \frac{\partial^2 y}{\partial x^2} = \frac{1}{V^2}\,\frac{\partial^2 y}{\partial t^2}.$$
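A short numerical illustration of this result: any profile translated rigidly as $f(x - Vt)$ should make the residual $\partial^2 y/\partial x^2 - V^{-2}\,\partial^2 y/\partial t^2$ vanish. The Python sketch below checks this with central second differences. The Gaussian profile, the speed V = 2 and the evaluation point are illustrative assumptions.

```python
import math

V = 2.0                                   # assumed wave speed

def profile(x):
    """Illustrative wave profile f(x) at t = 0 (an assumption)."""
    return math.exp(-x * x)

def y(x, t):
    """Non-dissipative wave: the profile translated by V t."""
    return profile(x - V * t)

def second_difference(g, a, h):
    """Central approximation of the second derivative of g at a."""
    return (g(a + h) - 2.0 * g(a) + g(a - h)) / (h * h)

x0, t0, h = 0.3, 0.7, 1e-3
y_xx = second_difference(lambda x: y(x, t0), x0, h)
y_tt = second_difference(lambda t: y(x0, t), t0, h)
residual = y_xx - y_tt / V**2
print(y_xx, y_tt / V**2, residual)
```

The residual is not exactly zero because of the finite step h, but it should be several orders of magnitude smaller than either second derivative.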


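Let us consider next the equation of motion of the strain and assume that the
external force vanishes, so that the equation of motion reads:

$$\frac{E}{2(1+\sigma)}\left[\nabla^2\mathbf{s} + \frac{1}{1-2\sigma}\,\nabla(\nabla\cdot\mathbf{s})\right] = \rho\,\frac{\partial^2\mathbf{s}}{\partial t^2}. \qquad (21.44)$$

We use now the result above that is, that every¹⁶ vector field can be decomposed
into a sum of a solenoidal and an irrotational vector. Let $\mathbf{s}_1$, $\mathbf{s}_2$ be these vectors, that
is $\mathbf{s} = \mathbf{s}_1 + \mathbf{s}_2$ where $\nabla\cdot\mathbf{s}_1 = 0$ and $\nabla\times\mathbf{s}_2 = 0$. Because equation (21.44) is linear in
the variable $\mathbf{s}$ we assume (this is not an innocent assumption but it is “reasonable”
since the body is linear and isotropic and the stress strain relation is linearized!) that
it is satisfied by both $\mathbf{s}_1$, $\mathbf{s}_2$ independently (this is an extra restriction on $\mathbf{s}_1$, $\mathbf{s}_2$!). This
assumption implies the equations:

$$\frac{E}{2(1+\sigma)}\,\nabla^2\mathbf{s}_1 = \rho\,\frac{\partial^2\mathbf{s}_1}{\partial t^2} \qquad (21.45)$$

$$\frac{E}{2(1+\sigma)}\left[\nabla^2\mathbf{s}_2 + \frac{1}{1-2\sigma}\,\nabla(\nabla\cdot\mathbf{s}_2)\right] = \rho\,\frac{\partial^2\mathbf{s}_2}{\partial t^2}. \qquad (21.46)$$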


¹⁵ Non-dissipative means that the wave preserves its form as it propagates. By this we mean that if
at time $t = 0$ its form at the origin is $f(\mathbf{r})$ where $\mathbf{r}$ is the position vector of a point in space (this is
a snapshot or form of the wave) then at a time t the wave (snapshot) at the point $\mathbf{r}$ is $f(\mathbf{r} - \mathbf{V}t)$
where $\mathbf{V}$ is the velocity of the wave.
¹⁶ To be precise a vector f is uniquely determined if its divergence and curl are known throughout
all of space, provided that f tends to zero like $1/r^2$ as $r \to \infty$. See for example Vector and Tensor
Analysis by H. Lass, McGraw-Hill, 1950.




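However from the identity (21.40) we have:

$$\nabla\times(\nabla\times\mathbf{s}_2) = \nabla(\nabla\cdot\mathbf{s}_2) - \nabla^2\mathbf{s}_2 = 0 \;\Rightarrow\; \nabla(\nabla\cdot\mathbf{s}_2) = \nabla^2\mathbf{s}_2.$$

Replacing in the second equation we find:

$$\frac{E(1-\sigma)}{(1+\sigma)(1-2\sigma)}\,\nabla^2\mathbf{s}_2 = \rho\,\frac{\partial^2\mathbf{s}_2}{\partial t^2}. \qquad (21.47)$$

Equations (21.45) and (21.47) are wave equations for the vectors $\mathbf{s}_1$, $\mathbf{s}_2$.

Equation (21.45) defines a transverse wave moving with speed $V_t = \sqrt{\frac{E}{2\rho(1+\sigma)}}$.
Equation (21.47) represents a longitudinal wave which is propagating with speed¹⁷:

$$V_l = \sqrt{\frac{E(1-\sigma)}{\rho(1+\sigma)(1-2\sigma)}}. \qquad (21.48)$$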
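To get a feeling for the two speeds, the Python sketch below evaluates $\sqrt{E/(2\rho(1+\sigma))}$ for the transverse wave and $\sqrt{E(1-\sigma)/(\rho(1+\sigma)(1-2\sigma))}$ for the longitudinal wave. The material constants are rough, steel-like values chosen only for illustration (an assumption, not data from the text), and the function names are hypothetical.

```python
import math

def transverse_speed(E, sigma, rho):
    """Speed of the solenoidal (transverse) wave."""
    return math.sqrt(E / (2.0 * rho * (1.0 + sigma)))

def longitudinal_speed(E, sigma, rho):
    """Speed of the irrotational (longitudinal) wave."""
    return math.sqrt(E * (1.0 - sigma) / (rho * (1.0 + sigma) * (1.0 - 2.0 * sigma)))

# Illustrative, roughly steel-like constants (assumed): Pa, dimensionless, kg/m^3.
E, sigma, rho = 2.0e11, 0.3, 7.8e3

v_t = transverse_speed(E, sigma, rho)
v_l = longitudinal_speed(E, sigma, rho)
print(v_t, v_l)

# The squared ratio depends on the Poisson ratio only.
print((v_l / v_t) ** 2, 2.0 * (1.0 - sigma) / (1.0 - 2.0 * sigma))
```

For $0 < \sigma < 1/2$ the squared ratio $2(1-\sigma)/(1-2\sigma)$ exceeds 1, so in this sketch the longitudinal wave is always the faster of the two.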


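Let us prove that the second wave is longitudinal. We assume that the wave is
traveling along the x-axis. Then, because the wave is assumed to be non-dissipative
we have:

$$\mathbf{s}_2 = \mathbf{s}_2(x - Vt) = u(x - Vt)\,\mathbf{i} + v(x - Vt)\,\mathbf{j} + w(x - Vt)\,\mathbf{k} \;\Rightarrow\; \nabla\times\mathbf{s}_2 = -\frac{\partial w}{\partial x}\,\mathbf{j} + \frac{\partial v}{\partial x}\,\mathbf{k} = 0$$

so that $\partial w/\partial x = \partial v/\partial x = 0$ which means that v, w are independent of $x - Vt$. We are not
interested in constant displacements therefore $\mathbf{s}_2 = u(x - Vt)\,\mathbf{i}$ i.e. the displacement
$\mathbf{s}_2$ is parallel to the direction of propagation of the wave i.e. the wave is longitudinal.
Working similarly we prove that the wave $\mathbf{s}_1$ is transverse (Exercise).
In general both types of waves are present and the net strain is found from their
composition.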
In general both types of waves are present and the net strain is found from their 
composition. 


Exercise 21.2.4 Show that if the external forces on the linear elastic isotropic
Newtonian body are negligible (i.e. the body moves ‘freely’) and if the body is in
a state of equilibrium then the strain satisfies the following ‘equation of motion’:

$$\nabla^2\mathbf{s} + \frac{1}{1-2\sigma}\,\nabla(\nabla\cdot\mathbf{s}) = 0. \qquad (21.49)$$

Exercise 21.2.5 Write equation (21.49) in case the strain is radial, that is $\mathbf{s} = s(r)\,\mathbf{r}$.


¹⁷ We compute the velocity from the propagation equation of a non-dissipative free wave, that is:

$$\nabla^2\mathbf{s} - \frac{1}{V^2}\,\frac{\partial^2\mathbf{s}}{\partial t^2} = 0.$$




Exercise 21.2.6 A coaxial cable is made by filling the space between a solid core
of radius a and a concentric cylindrical shell of internal radius b with rubber. If
the core is displaced a small distance axially, find the displacement in the rubber.
Assume that end effects, gravity and the distortion of the metal can be neglected and
the body is a linear elastic isotropic body.


21.2.4 The Dynamic Equation of Motion for a Linear Elastic 
Isotropic Newtonian Body 


In a linear elastic isotropic Newtonian body the stress tensor has the form $t_{\mu\nu} = p\,\delta_{\mu\nu}$
where p is the isotropic pressure. For this body we find that the divergence
$t_{\mu\nu,\nu} = p_{,\mu}$. This implies the following:


a. The rate of strain is a solution of the equation:

$$\dot e_{\mu\nu} = \frac{(1-2\sigma)\,p}{E}\,\delta_{\mu\nu}$$

that is, in a linear elastic isotropic Newtonian body one can apply only bulk
stresses and can have strain motion, which is isotropic expansion or contraction
only. For example a spherical elastic isotropic body cannot change shape under
the action of isotropic stresses and can either contract to a smaller sphere or
expand to a larger sphere.

b. The equation of motion for a strain motion of a linear elastic isotropic Newtonian
body is:

$$R^{\mu} + p^{,\mu} = \rho\,a^{\mu} \qquad (21.50)$$

where $a^{\mu}$ is the “acceleration” of the displacement vector at a point inside the
medium. As we shall show (see (22.4)) the time derivative of any vector field $\mathbf{f}$ is
given by the formula:

$$\frac{d\mathbf{f}}{dt} = \frac{\partial\mathbf{f}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{f}$$

where $\mathbf{v}$ is the first time derivative of the position vector (i.e. the velocity). Using
this we write the equation of motion (21.50) as:

$$\mathbf{R} + \nabla p = \rho\left(\frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v}\right) \qquad (21.51)$$




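or, using the decomposition (19.8):

$$\frac{1}{\rho}(\mathbf{R} + \nabla p) = \frac{\partial\mathbf{v}}{\partial t} + \frac{1}{2}\,\nabla v^2 - \mathbf{v}\times(\nabla\times\mathbf{v}). \qquad (21.52)$$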


Exercise 21.2.7 Show that the term v x (V X Vv) = 2V x w or in covariant form 
(v x (V x v))¥ = —2e""P vy» where es"? is the Levi Civita antisymmetric symbol 
and we have used (20.7). Then prove that the equation of motion (21.51) can be 
written as follows: 


1 dv! 1 av? 
cee 1d eer aa 21,53) 


We shall use this expression when we study the motion of perfect fluids. 


Chapter 22
Newtonian Fluids


The Newtonian fluids are special types of linear Newtonian deformable bodies 
which we shall define in an exact physical sense presently. Practically every 
Newtonian deformable body may be considered as a fluid with certain properties. 
Therefore it is better that we continue with the term fluid in a vague manner until we 
end up with a definition. For the time being we consider a fluid to be an aggregate of 
a great number of particles (a continuum) which moves under the mutual interaction 
of its particles and the application of external forces. 

Historically there have been two methods in the description of the physical 
properties of a Newtonian fluid, the Lagrange approach and the Euler approach. 


a. The Lagrange approach.
In this approach one describes the physical properties of a fluid in time by
monitoring a measuring probe which for convenience we consider to be a single
fluid particle. Each particle, the A say, is labeled at the moment $t_0$ by its position
vector $\mathbf{r}_A(0)$ or any other set of three independent parameters. The position
of the specific particle in space at the moment t is given by its position vector
$\mathbf{r}_A(t) = \mathbf{r}(\mathbf{r}_A(0), t)$ where t is common to all particles due to the absolute
nature of time in Newtonian Physics.¹ Therefore the Lagrangian approach in
the description of the physical properties of a fluid uses the congruence of the
trajectories of the fluid particles in space. From a geometric point of view these
trajectories are the integral curves of the Lagrangian velocity field $\mathbf{v}_A$ of the fluid
in space which is defined as $\mathbf{v}_A = d\mathbf{r}_A/dt$.

b. The Euler approach. 
In the Euler approach one considers at a fixed point in space an elementary
volume in which various probes are placed which monitor the corresponding

¹ This is not possible for relativistic fluids where universal time does not exist and one has to
introduce the concept of synchronization. For this reason we shall consider relativistic fluids
separately.


© Springer Nature Switzerland AG 2019 757 
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-27347-7_22 




physical properties of the fluid at that point. Due to the Newtonian Equivalence
Principle the physical properties of the fluid are described by Euclidean tensor
fields. Therefore in the Euler approach the fluid motion is described by the time
evolution of relevant tensor fields defined at each moment across the fluid. For
example in the space occupied by the fluid at the time moment $t_0$ the matter density
is $\rho(t_0, \mathbf{r})$ where $\mathbf{r}$ is the position vector of any point across the fluid.


The difference between the Eulerian and the Lagrangian approach is that the first 
focuses on the monitoring of the physical fields across the fluid whereas the second 
considers the monitoring of the physical properties of the fluid along one flow line 
of the fluid. 

Let us give a first definition of the fluid as a working tool. 


Definition 22.0.1 A Newtonian fluid is a linear Newtonian deformable body 
which: 


a. Occupies a restricted connected region of space 

b. It cannot perform rigid motion alone, but it can perform strain motion alone. It 
can also perform strain and rigid motions simultaneously. 

c. Its physical properties are described by a number of Euclidean differentiable
tensor fields the most important and fundamental being:


a. A scalar field $\rho(t, \mathbf{r})$ which we call matter density or simply density

b. A vector field $\mathbf{u}(t, \mathbf{r})$ which we call the Eulerian velocity field or simply the
velocity field which describes the velocity of the fluid at all its points at a
given moment t.


22.1 Mathematical Concepts Relevant to Fluids 


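Consider a vector function $\mathbf{f}$ defined over the four-dimensional space t, x, y, z
where t and x, y, z are independent coordinates, and let x, y, z be functions of t along
a curve, i.e. x(t), y(t), z(t). In this case formula (19.9) gives:

$$d\mathbf{f} = (d\mathbf{r}\cdot\nabla)\mathbf{f} + \frac{\partial\mathbf{f}}{\partial t}\,dt = \left(\frac{\partial}{\partial t} + \mathbf{u}\cdot\nabla\right)\mathbf{f}\,dt \qquad (22.1)$$

where $\mathbf{u}\,dt = d\mathbf{r}$. We introduce the operator:

$$\frac{D}{Dt} = \frac{\partial}{\partial t} + \mathbf{u}\cdot\nabla \qquad (22.2)$$

and write:

$$\frac{d\mathbf{f}(\mathbf{r}(t), t)}{dt} = \frac{D\mathbf{f}}{Dt}. \qquad (22.3)$$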




We note that the operator $\frac{D}{Dt}$ gives the total change that is, the change due to both
the time coordinate and the space coordinates. The operator $\frac{D}{Dt}$ is called the mobile
operator for reasons which will be clear below.

As an application of the mobile operator we consider its action on the position
vector $\mathbf{r}$ of a moving particle. We have:

$$\frac{D\mathbf{r}}{Dt} = \frac{d\mathbf{r}}{dt} = \mathbf{u}$$

where $\mathbf{u}(\mathbf{r}, t)$ is the Eulerian velocity field of the fluid.

Similarly for the acceleration we find if we take $\mathbf{f} = \mathbf{u}$ in (22.3):

$$\mathbf{a} = \frac{D\mathbf{u}}{Dt} = \frac{\partial\mathbf{u}}{\partial t} + (\mathbf{u}\cdot\nabla)\mathbf{u}. \qquad (22.4)$$

Using (19.8) we write the acceleration in the alternative form:

$$\mathbf{a} = \frac{\partial\mathbf{u}}{\partial t} + \frac{1}{2}\,\nabla u^2 - \mathbf{u}\times(\nabla\times\mathbf{u}). \qquad (22.5)$$

We shall use this result to write Newton’s Second Law for the motion of an
accelerated particle of the fluid.
Note that the equation:

$$\frac{Df}{Dt} = 0$$

for any function $f(\mathbf{r}(t), t)$ implies that f is a constant of motion of the fluid.

The equation $(\mathbf{u}\cdot\nabla) f = 0$ which arises frequently implies that f is constant
along a streamline. It is emphasized that this equation offers no information about
whether f is a constant. Suppose for instance that the flow is everywhere in the
x-direction so that this equation reduces to $\partial f/\partial x = 0$. This equation says that f is
independent of x, but it does not constrain how f depends on y, z or t.


Exercise 22.1.1 Write the acceleration in the form encountered in rotating
coordinate systems (i.e. Coriolis acceleration etc.).


Proposition 22.1.1 The mobile operator $\frac{D}{Dt}$ defines a derivation in $R^n$, that is, it
is R-linear and satisfies the Leibniz rule. The derivative defined by $\frac{D}{Dt}$ we call the
total derivative or (especially in fluid dynamics) the material derivative.




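Proof R-linearity is obvious. Concerning the Leibniz rule we prove only the case
of the product of a vector $\mathbf{S}$ with a scalar $\rho$ and leave the general proof to the reader.

$$\begin{aligned}
\frac{D(\rho\mathbf{S})}{Dt} &= \frac{\partial(\rho\mathbf{S})}{\partial t} + (\mathbf{u}\cdot\nabla)(\rho\mathbf{S}) \\
&= \frac{\partial\rho}{\partial t}\,\mathbf{S} + \rho\,\frac{\partial\mathbf{S}}{\partial t} + \mathbf{S}\,(\mathbf{u}\cdot\nabla)\rho + \rho\,(\mathbf{u}\cdot\nabla)\mathbf{S} \\
&= \frac{D\rho}{Dt}\,\mathbf{S} + \rho\,\frac{D\mathbf{S}}{Dt}. \qquad \square
\end{aligned}$$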


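We justify now the name mobile operator. Consider a tensorial field $S(\mathbf{r}, t)$ which
describes a physical or geometric property of the continuum at the point P. We
consider a change $\delta t$ in t and an induced change $\delta\mathbf{r} = \mathbf{u}\,\delta t$ in $\mathbf{r}$. Then the change² of
the value of $S(\mathbf{r}, t)$ (at the point P!) is given by the difference:

$$\delta S(\mathbf{r}, t) = S(\mathbf{r} + \mathbf{u}\,\delta t,\; t + \delta t) - S(\mathbf{r}, t). \qquad (22.6)$$

Expanding $S(\mathbf{r} + \mathbf{u}\,\delta t,\; t + \delta t)$ in terms of $\delta t$ at the point P we find:

$$\delta S(\mathbf{r}, t) = \left[\frac{\partial S}{\partial t} + u^{\mu}\frac{\partial S}{\partial x^{\mu}}\right]\delta t + O(\delta t^2) = \left[\frac{\partial S}{\partial t} + (\mathbf{u}\cdot\nabla)S\right]\delta t + O(\delta t^2) = \frac{DS}{Dt}\,\delta t + O(\delta t^2). \qquad (22.7)$$

It follows that for any tensor field S:

$$\frac{DS}{Dt} = \lim_{\delta t\to 0}\frac{\delta S}{\delta t} \qquad (22.8)$$

which justifies the name mobile operator for $\frac{D}{Dt}$. Let us see an application of the
mobile operator.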
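Before the worked example, the limit (22.8) can be illustrated numerically: the quotient $[S(\mathbf r + \mathbf u\,\delta t, t + \delta t) - S(\mathbf r, t)]/\delta t$ should approach $\partial S/\partial t + (\mathbf u\cdot\nabla)S$ as $\delta t$ shrinks. The Python sketch below compares the two for a scalar field. The velocity field `u`, the field `S`, the evaluation point and the function names are illustrative assumptions, not taken from the text.

```python
import math

def u(r, t):
    """Illustrative velocity field (an assumption): rotation about z plus a drift."""
    x, y, z = r
    return (-y, x, 0.1)

def S(r, t):
    """Illustrative scalar field (an assumption)."""
    x, y, z = r
    return x * x * y + z * t + math.sin(t)

def material_derivative(S, u, r, t, h=1e-5):
    """dS/dt + (u . grad) S, with all partial derivatives by central differences."""
    dS_dt = (S(r, t + h) - S(r, t - h)) / (2 * h)
    vel = u(r, t)
    convective = 0.0
    for j in range(3):
        rp, rm = list(r), list(r)
        rp[j] += h
        rm[j] -= h
        convective += vel[j] * (S(rp, t) - S(rm, t)) / (2 * h)
    return dS_dt + convective

def difference_quotient(S, u, r, t, dt):
    """[S(r + u dt, t + dt) - S(r, t)] / dt, the quotient whose limit is DS/Dt."""
    vel = u(r, t)
    moved = [r[j] + vel[j] * dt for j in range(3)]
    return (S(moved, t + dt) - S(r, t)) / dt

r0, t0 = [1.0, 2.0, 3.0], 0.5
DS_Dt = material_derivative(S, u, r0, t0)
quotients = [difference_quotient(S, u, r0, t0, dt) for dt in (1e-2, 1e-4, 1e-6)]
print(DS_Dt)
print(quotients)
```

The quotients should approach the material derivative roughly linearly in $\delta t$, in line with the $O(\delta t^2)$ remainder in (22.7).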


Example 22.1.1 A fluid flows from infinity along the x-axis towards the origin
with constant speed U and passes a fixed sphere of radius R centered at the origin.
Given that the velocity field of the fluid at the point with position vector $\mathbf{r}$ is
$\mathbf{u} = -U(1 + R^3 r^{-3})\,\mathbf{i} + 3R^3 r^{-5} x U\,\mathbf{r}$ (U = constant) compute the acceleration of
the fluid at the point $\mathbf{r} = b\,\mathbf{i}$ ($b > R$) and evaluate the maximum value $|a_{\max}|$ of its
magnitude for the various values of b.


² We use $\delta t$ and not dt because this change is measured at the same point P of space. Furthermore
we use $\delta\mathbf{r}$ and not $d\mathbf{r}$ because the change in $\mathbf{r}$ is defined in terms of the change in t. If it was
independent of t then we should use $d\mathbf{r}$!




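Solution
Because the velocity field is independent of time (in such cases we say that the
motion of the fluid is steady) the acceleration is (see (22.4)):

$$\mathbf{a} = \frac{D\mathbf{u}}{Dt} = (\mathbf{u}\cdot\nabla)\mathbf{u}.$$

At $\mathbf{r} = b\,\mathbf{i}$ the velocity $\mathbf{u} = (2R^3 b^{-3} - 1)U\,\mathbf{i}$ therefore $\mathbf{u}\cdot\nabla = u\,\frac{\partial}{\partial x}$, where $u =
(2R^3 b^{-3} - 1)U$ (this observation saves us from computing all derivatives of $\mathbf{u}$).
We compute:

$$\frac{\partial\mathbf{u}}{\partial x} = 3UR^3 r^{-4}\,\frac{\partial r}{\partial x}\,\mathbf{i} - 15UR^3 r^{-6} x\,\frac{\partial r}{\partial x}\,\mathbf{r} + 3UR^3 r^{-5}\,\mathbf{r} + 3UR^3 r^{-5} x\,\frac{\partial\mathbf{r}}{\partial x}.$$

But $\frac{\partial\mathbf{r}}{\partial x} = \mathbf{i}$, $\frac{\partial r}{\partial x} = \frac{x}{r}$, so that at $\mathbf{r} = b\,\mathbf{i}$ we have:

$$\frac{\partial\mathbf{u}}{\partial x} = -6UR^3 b^{-4}\,\mathbf{i}.$$

Therefore the acceleration at the point $\mathbf{r} = b\,\mathbf{i}$ is:

$$\mathbf{a} = 6U^2 (b^3 - 2R^3) R^3 b^{-7}\,\mathbf{i}.$$

The maximum value of the magnitude $|\mathbf{a}|$ occurs when $\frac{d}{db}\left[(b^3 - 2R^3)R^3 b^{-7}\right] = 0$
and $\frac{d^2}{db^2}\left[(b^3 - 2R^3)R^3 b^{-7}\right] < 0$. These conditions imply $b = (7/2)^{1/3}R$ and finally
$|a_{\max}| = 9(2/7)^{7/3}\,U^2/R$.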
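The result of the example can be cross-checked numerically. The Python sketch below evaluates $(\mathbf u\cdot\nabla)\mathbf u$ by central differences for the velocity field $\mathbf u = -U(1 + R^3 r^{-3})\,\mathbf i + 3R^3 r^{-5} x U\,\mathbf r$ and compares it, on the x-axis, with the closed form $6U^2(b^3 - 2R^3)R^3 b^{-7}$. It then evaluates the stationary point $b = (7/2)^{1/3}R$. The units U = R = 1, the test value b = 1.7 and the function names are assumptions made for the illustration.

```python
import math

def velocity(r, U, R):
    """u = -U (1 + R^3 / r^3) i + 3 R^3 x U r_vec / r^5."""
    x, y, z = r
    rad = math.sqrt(x * x + y * y + z * z)
    c = 3 * R**3 * x * U / rad**5
    return (-U * (1 + R**3 / rad**3) + c * x, c * y, c * z)

def acceleration(r, U, R, h=1e-5):
    """Steady flow: a = (u . grad) u, with central differences for the gradient."""
    u = velocity(r, U, R)
    a = [0.0, 0.0, 0.0]
    for j in range(3):
        rp, rm = list(r), list(r)
        rp[j] += h
        rm[j] -= h
        up, um = velocity(rp, U, R), velocity(rm, U, R)
        for i in range(3):
            a[i] += u[j] * (up[i] - um[i]) / (2 * h)
    return a

def a_x_on_axis(b, U, R):
    """Closed form on the x-axis: 6 U^2 (b^3 - 2 R^3) R^3 / b^7."""
    return 6 * U**2 * (b**3 - 2 * R**3) * R**3 / b**7

U, R = 1.0, 1.0                      # illustrative units (assumed)
b = 1.7
a_num = acceleration([b, 0.0, 0.0], U, R)

b_star = (7.0 / 2.0) ** (1.0 / 3.0) * R
a_star = a_x_on_axis(b_star, U, R)

print(a_num[0], a_x_on_axis(b, U, R))
print(b_star, a_star, 9 * (2.0 / 7.0) ** (7.0 / 3.0) * U**2 / R)
```

In this sketch the stationary point is a local maximum of the component of the acceleration along $\mathbf i$. Evaluating `a_x_on_axis` for $R < b < 2^{1/3}R$ gives a negative component (it tends to $-6U^2/R$ as $b \to R$), so the value $9(2/7)^{7/3}U^2/R$ is best read as the maximum on the branch where the component is positive.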


22.2 Classification of Fluids and Fluid Flows 


Kinematically the fluids are classified by the properties of the (Eulerian) velocity 
field. We consider the following types of flow of a fluid. 


• The flow of the fluid is called irrotational iff there is a scalar field $\phi(t, \mathbf{r})$, called
velocity potential, such that $\mathbf{u} = -\nabla\phi$. In this case $\oint \mathbf{u}\cdot d\mathbf{r} = 0$ for any closed
smooth path which lies entirely within the fluid.

• The flow of the fluid is called steady or stationary iff $\partial\mathbf{u}/\partial t = 0$ that is, the velocity
field (as a vector) at a point in space is constant in time. If the motion of a fluid
is not steady we call it unsteady.


Dynamically the fluids are classified in terms of the dynamic fields describing 
the physical properties of the fluid. We consider the following basic types of fluids. 




• A fluid is called a dust if it is characterized dynamically by one scalar field, the
matter density $\rho(t, \mathbf{r})$. Physically a dust is an aggregate of particles which do not
interact, that is the forces in the interior of the fluid (i.e. the stress) vanish.

• A fluid is called homogeneous (in space) if the matter density $\rho(t, \mathbf{r})$ = constant
in space.

• A fluid which has the properties of a perfect deformable material (that is, if we
place a small surface anywhere in the fluid and in any direction the stress force
on the surface is always normal to the surface) resulting in an isotropic pressure
p we call a perfect fluid. Dynamically a perfect fluid is characterized completely
by the scalar fields $\rho(t, \mathbf{r})$ and $p(t, \mathbf{r})$. During the flow of a perfect fluid all stress
(i.e. internal) forces are normal to the displacement of the small surface (they do
not consume energy) therefore a perfect fluid cannot conduct heat and cannot be
stirred. Furthermore it is isotropic but not necessarily homogeneous.

• Any fluid which is not a perfect fluid we call an imperfect fluid. Physically an
imperfect fluid is a fluid in which the stress forces on a small surface anywhere
in the fluid are not normal to the (Eulerian) velocity field of the fluid, therefore
imperfect fluids can conduct heat and can be anisotropic. Furthermore imperfect
fluids require more tensor fields than the matter density $\rho(t, \mathbf{r})$ and the pressure
$p(t, \mathbf{r})$ in order to be described dynamically.


22.2.1 Geometric Description of the Flow of a Fluid 


In a fluid we consider two types of lines. 


a. The lines of flow or pathlines which are the trajectories in space of the individual 
particles of the fluid or, equivalently, the integral lines of the particle velocity field 
v and 

b. The streamlines, which are the integral curves of the (Eulerian) velocity vector
field $\mathbf{u}(t, \mathbf{r})$ of the fluid. In general the pathlines and the streamlines do not
coincide. The condition for these to coincide is $\partial\mathbf{u}/\partial t = 0$, that is, the flow of the fluid is
steady. If the flow is not steady then the streamlines form a continuously changing
pattern.

A stream surface in a fluid is a surface in space with the property that at every
one of its points the (Eulerian) velocity $\mathbf{u}(t, \mathbf{r})$ is tangent to the surface. Obviously a
stream surface consists of streamlines. The intersection of two stream surfaces in
the fluid gives a streamline.


Given a closed curve in the fluid a stream surface is formed by drawing the 
streamline through every point of the curve. A stream surface whose cross sectional 
area is infinitesimally small we call a stream filament. 


22.2.2 Calculating the Streamlines 


The streamlines are the integral curves of the (Eulerian) velocity field. From the
definition $\mathbf{u} = d\mathbf{r}/dt$ we have that the vectors $\mathbf{u}$ and $d\mathbf{r}$ are parallel, therefore the
quotient of their components in any frame of reference is the same. This implies
that if in an arbitrary frame we have the decomposition
$\mathbf{u} = u_x(t, \mathbf{r})\mathbf{i} + u_y(t, \mathbf{r})\mathbf{j} + u_z(t, \mathbf{r})\mathbf{k}$
and $d\mathbf{r} = dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}$ then the streamlines (in that frame!) are given by the
solution of the system of simultaneous equations:

    $\frac{dx}{u_x(t, \mathbf{r})} = \frac{dy}{u_y(t, \mathbf{r})} = \frac{dz}{u_z(t, \mathbf{r})}.$    (22.9)

Systems of simultaneous equations of this type are called Lagrange systems.

A first integral of these equations must be of the form $f(t, \mathbf{r}) = $ constant.
This equation defines a stream surface in the fluid. Its intersection with a second
independent solution, $g(t, \mathbf{r}) = $ constant say, of the Lagrange system gives a
streamline in the fluid.


Example 22.2.1 The (Eulerian) velocity field of a fluid is
$\mathbf{u} = \mathbf{i} \times \mathbf{r} + \cos at\,\mathbf{j} + \sin at\,\mathbf{k}$ where $a$ is a constant ($\neq 1$). Find the streamlines and the pathlines.
Discuss the special case $a = 0$.

Solution 

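We write $\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$ and compute $\mathbf{i} \times \mathbf{r} = -z\mathbf{j} + y\mathbf{k}$. Therefore the velocity
field is:

    $\mathbf{u} = (-z + \cos at)\,\mathbf{j} + (y + \sin at)\,\mathbf{k}.$

The Lagrange system corresponding to this velocity field is:

    $\frac{dx}{0} = \frac{dy}{-z + \cos at} = \frac{dz}{y + \sin at}.$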


One solution of the system is $x = $ constant, that is, one family of stream surfaces
consists of planes normal to the x-axis. To find a second family of stream surfaces we
consider the last two terms and write:

    $(y + \sin at)\,dy = (-z + \cos at)\,dz.$

Integrating (at fixed $t$) we find the second family of stream surfaces:

    $y^2 + z^2 + 2y \sin at - 2z \cos at = G$

where $G$ is a constant. Completing the squares gives $(y + \sin at)^2 + (z - \cos at)^2 = G + 1$,
so this new family of stream surfaces consists of circular cylinders with axis parallel
to the x-axis. The streamlines are the intersections of
these two families of stream surfaces. If $a \neq 0$ the streamlines form a




continuously changing pattern, the motion being time dependent. If $a = 0$ then
the flow is steady (because in that case $\mathbf{u} = (-z + 1)\mathbf{j} + y\mathbf{k}$ is independent of
$t$) and the streamlines are fixed circles given by the equations $x = $ constant and
$y^2 + z^2 - 2z = G$.

The pathlines, that is, the trajectories of the particles of the fluid, are given
by the functions $x(t), y(t), z(t)$ and are found from the solution of the system of
equations:

    $\frac{dx}{dt} = 0, \quad \frac{dy}{dt} = -z + \cos at, \quad \frac{dz}{dt} = y + \sin at.$

The solution of the first is $x = $ constant. To solve the other two equations we
differentiate the equation for $dy/dt$ and replace $dz/dt$ from the last equation. We end up with the equation:

    $\frac{d^2 y}{dt^2} + y = -(a + 1) \sin at.$

Assuming that $a \neq 1$ the solution is:

    $y(t) = A \cos t + B \sin t + C \sin at$

where $A, B$ are arbitrary constants and $C = 1/(a - 1)$. Replacing this in the equation
for $dy/dt$ we find:

    $z(t) = A \sin t - B \cos t - C \cos at.$


In the special case $a = 0$ we find easily $y^2 + (z - 1)^2 = $ constant so that the
pathlines coincide with the streamlines as expected, because in that case the flow is
steady.
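
The pathline solution of Example 22.2.1 can be spot-checked numerically. The sketch below is only an illustration, not part of the original solution: the function names and the sample values of $A$, $B$, $a$ are assumptions. It compares a central-difference derivative of the closed-form $(y(t), z(t))$ with the velocity field evaluated on the pathline.

```python
import math

def velocity(t, y, z, a):
    """(u_y, u_z) of Example 22.2.1; the x-component vanishes."""
    return (-z + math.cos(a * t), y + math.sin(a * t))

def pathline(t, A, B, a):
    """Closed-form pathline (y(t), z(t)) with C = 1/(a - 1); requires a != 1."""
    C = 1.0 / (a - 1.0)
    y = A * math.cos(t) + B * math.sin(t) + C * math.sin(a * t)
    z = A * math.sin(t) - B * math.cos(t) - C * math.cos(a * t)
    return y, z

def residual(t, A, B, a, h=1e-5):
    """Largest mismatch between d(y, z)/dt (central difference) and the velocity field."""
    y, z = pathline(t, A, B, a)
    yp, zp = pathline(t + h, A, B, a)
    ym, zm = pathline(t - h, A, B, a)
    uy, uz = velocity(t, y, z, a)
    return max(abs((yp - ym) / (2 * h) - uy), abs((zp - zm) / (2 * h) - uz))

if __name__ == "__main__":
    # Illustrative constants (assumed); the residual should be at round-off level.
    print(max(residual(0.1 * k, A=0.7, B=-0.3, a=2.5) for k in range(100)))
```

For $a = 0$ the same `pathline` function returns points on the circle $y^2 + (z - 1)^2 = A^2 + B^2$, in agreement with the steady case discussed above.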


Example 22.2.2 In a coordinate system the velocity field of a fluid is given in terms
of its components as follows: $u = -c^2 y/r^2$, $v = c^2 x/r^2$, $w = 0$, where $c$ is a constant
($\neq 0$) and $r$ denotes distance from the z-axis. Prove that the streamlines are
given by the equations $x^2 + y^2 = $ constant, $z = $ constant. Also prove
that the velocity field admits a velocity potential and then show that the function
$\phi(x, y, z) = -c^2 \tan^{-1}(y/x)$ is a velocity potential.
Solution
The velocity $\mathbf{u}$ of the fluid is:

    $\mathbf{u} = -\frac{c^2 y}{r^2}\,\mathbf{i} + \frac{c^2 x}{r^2}\,\mathbf{j}.$


The streamlines are the solution of the system of equations:

    $\frac{dx}{-c^2 y/r^2} = \frac{dy}{c^2 x/r^2} = \frac{dz}{0}.$



The solution of the first equation is $x^2 + y^2 = $ constant, and the solution of the
last equation is $z = $ constant. These two equations are the equations of the
streamlines of the fluid.

To show that there exists a velocity potential, $\phi$ say, it is enough to show that
$\nabla \times \mathbf{u} = 0$. This is left as an easy exercise to the reader. To compute the velocity
potential $\phi$ we write $\mathbf{u} = -\nabla\phi$ (with $r^2 = x^2 + y^2$):

    $\frac{\partial \phi}{\partial x} = \frac{c^2 y}{r^2}, \quad \frac{\partial \phi}{\partial y} = -\frac{c^2 x}{r^2}, \quad \frac{\partial \phi}{\partial z} = 0.$

The second equation gives:

    $\phi = -c^2 \int \frac{x}{r^2}\,dy = -c^2 \tan^{-1}\left(\frac{y}{x}\right) + C(x).$

Replacing in the first equation we find that $C(x) = $ const. $= C$. The constant is not
important for the velocity potential and finally we have that $\phi = -c^2 \tan^{-1}(y/x)$.
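
A quick numerical cross-check of this result is sketched below; it is an illustration under stated assumptions, not part of the original solution. The function names, the sample point and the step size are hypothetical, and the check uses the convention $\mathbf{u} = -\nabla\phi$ with $x > 0$ so that $\tan^{-1}(y/x)$ is single valued.

```python
import math

def potential(x, y, c):
    """phi = -c^2 * arctan(y / x), valid for x > 0."""
    return -c * c * math.atan(y / x)

def velocity(x, y, c):
    """(u, v) of Example 22.2.2; r is the distance from the z-axis."""
    r2 = x * x + y * y
    return (-c * c * y / r2, c * c * x / r2)

def minus_grad(x, y, c, h=1e-6):
    """-grad(phi) estimated by central differences."""
    dphi_dx = (potential(x + h, y, c) - potential(x - h, y, c)) / (2 * h)
    dphi_dy = (potential(x, y + h, c) - potential(x, y - h, c)) / (2 * h)
    return (-dphi_dx, -dphi_dy)

if __name__ == "__main__":
    print(velocity(1.3, -0.4, 2.0))
    print(minus_grad(1.3, -0.4, 2.0))  # should agree with the line above
```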


Example 22.2.3 Prove that $\phi = x f(r)$ ($r^2 = x^2 + y^2 + z^2$, $f' \neq 0$) is a possible
form for the velocity potential of an incompressible fluid motion. Given that the
fluid speed $u \to 0$ as $r \to \infty$, deduce that the surfaces of constant speed are given
by $(r^2 + 3x^2)\,r^{-8} = $ constant.

Solution 

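The velocity $\mathbf{u}$ of the fluid is:

    $\mathbf{u} = -\nabla\phi = -f\,\mathbf{i} - \frac{x f'}{r}\,\mathbf{r}.$

Its divergence is:

    $\nabla \cdot \mathbf{u} = -\nabla f \cdot \mathbf{i} - \frac{x f'}{r}\,\nabla \cdot \mathbf{r} - \mathbf{r} \cdot \nabla\left(\frac{x f'}{r}\right) = -x\left(f'' + \frac{4}{r} f'\right).$

The condition for an incompressible fluid is $\nabla \cdot \mathbf{u} = 0$. This gives the condition
($f' \neq 0$):

    $f'' + \frac{4}{r} f' = 0.$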




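The solution of this equation is $f(r) = -\frac{1}{3} A r^{-3} + B$ where $A, B$ are integration
constants. For this value of $f(r)$ we find that the velocity of the incompressible fluid
is:

    $\mathbf{u} = \left(\frac{A}{3r^3} - B\right)\mathbf{i} - \frac{A x}{r^5}\,\mathbf{r}.$

The values of the integration constants are fixed from the asymptotic conditions.
We note that when $r \to \infty$ the velocity $\mathbf{u} \to -B\,\mathbf{i}$ so that $B = 0$ and
$\mathbf{u} = \frac{A}{3r^3}\left(\mathbf{i} - \frac{3x}{r^2}\,\mathbf{r}\right)$.

The speed satisfies
$u^2 = \left(\frac{A}{3r^3}\right)^2 \left(1 - \frac{6x^2}{r^2} + \frac{9x^2}{r^2}\right) = \frac{A^2}{9 r^8}\,(r^2 + 3x^2)$,
therefore $u^2 = $ constant on the surfaces $r^{-8}(r^2 + 3x^2) = $ constant.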
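
The closed-form speed and the incompressibility of this velocity field can be checked numerically. The following sketch is an illustration only: the function names, the value of $A$ and the sample point are assumptions, and a finite-difference check is not a proof.

```python
def velocity(x, y, z, A):
    """u = (A / (3 r^3)) * (i - 3 x r_vec / r^2), the field of Example 22.2.3 with B = 0."""
    r2 = x * x + y * y + z * z
    k = A / (3.0 * r2 ** 1.5)
    return (k * (1.0 - 3.0 * x * x / r2), k * (-3.0 * x * y / r2), k * (-3.0 * x * z / r2))

def speed_squared_closed_form(x, y, z, A):
    """u^2 = (A^2 / 9) * (r^2 + 3 x^2) / r^8."""
    r2 = x * x + y * y + z * z
    return (A * A / 9.0) * (r2 + 3.0 * x * x) / r2 ** 4

def divergence(x, y, z, A, h=1e-5):
    """Central-difference estimate of div u."""
    dux = (velocity(x + h, y, z, A)[0] - velocity(x - h, y, z, A)[0]) / (2 * h)
    duy = (velocity(x, y + h, z, A)[1] - velocity(x, y - h, z, A)[1]) / (2 * h)
    duz = (velocity(x, y, z + h, A)[2] - velocity(x, y, z - h, A)[2]) / (2 * h)
    return dux + duy + duz

if __name__ == "__main__":
    point = (0.8, -0.5, 1.1)  # arbitrary point away from the origin (assumed)
    ux, uy, uz = velocity(*point, A=2.0)
    print(ux * ux + uy * uy + uz * uz, speed_squared_closed_form(*point, A=2.0))
    print(divergence(*point, A=2.0))  # should be close to zero
```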


22.3. The Equation of Continuity: Conservation of Mass 


The equation of continuity is a constraint equation, which expresses the requirement 
that the total mass of a fluid in a region within the fluid is conserved. 

In order to derive the equation of continuity we consider a simply connected
region $R$ of space within the mass of the fluid, with volume $V$ which is bounded
by a (smooth) surface $S$. We also consider an elementary volume $dV$ at a point $P$
inside $R$ with position vector $\mathbf{r}$ at the time moment $t_0$. The mass within the volume
$V$ at the moment $t_0$ is $\int_R dM = \int_R \rho(t_0, \mathbf{r})\,dV$ where $\rho(t, \mathbf{r})$ is the density of the fluid.
As the fluid moves the mass (in the same region $R$!) changes. The overall change³
is $\frac{d}{dt}\int_R dM = \int_R \frac{\partial \rho}{\partial t}\,dV$. This change in mass consists of the following parts:


• The mass which leaves and enters the region $R$. If at the point $P$ of the surface $S$
(enclosing the region $R$) the velocity of the fluid is $\mathbf{u}$ and the element of surface
is $d\mathbf{S}$, this change is given by (see Fig. 22.1):

    $\frac{dM}{dt} = -\rho\,\frac{d\mathbf{l}}{dt} \cdot d\mathbf{S} = -\rho\,\mathbf{u} \cdot d\mathbf{S}.$

The overall change is $-\oint_S \rho\,\mathbf{u} \cdot d\mathbf{S}$ where $\mathbf{u} \cdot d\mathbf{S} < 0$ for mass entering the region
$R$ and $\mathbf{u} \cdot d\mathbf{S} > 0$ for mass leaving $R$.

• The mass which is created by sources or destroyed in sinks within the volume $V$.
We measure this mass with a function $\psi(t, \mathbf{r})\,dV$, where we assume $\psi(t, \mathbf{r})$ to be
positive for sources and negative for sinks. The overall change due to sinks and
sources is $\int_R \psi(t, \mathbf{r})\,dV$.


³ We enter $\frac{d}{dt}$ into the integral because the integrating region is independent of time $t$.


Fig. 22.1 Conservation of mass


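Conservation of mass in the region $R$ is expressed by the equation:

    $\int_R \frac{\partial \rho}{\partial t}\,dV = \int_R \psi(t, \mathbf{r})\,dV - \oint_S \rho\,\mathbf{u} \cdot d\mathbf{S}.$    (22.10)

From Gauss's theorem (the divergence theorem) we have:

    $\oint_S \rho\,\mathbf{u} \cdot d\mathbf{S} = \int_R \nabla \cdot (\rho\,\mathbf{u})\,dV.$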


Because $R$ is arbitrary, equation (22.10) implies:

    $\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho\,\mathbf{u}) = \psi(t, \mathbf{r})$    (22.11)

or in terms of the material derivative

    $\frac{D\rho}{Dt} + \rho\,\nabla \cdot \mathbf{u} = \psi(t, \mathbf{r}).$    (22.12)

This equation is called the continuity equation (for matter).
The continuity equation simplifies for various special types of fluids as follows:

a. If there are no sinks or sources, $\psi(t, \mathbf{r}) = 0$ and the continuity equation reads:

    $\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho\,\mathbf{u}) = 0.$    (22.13)

b. If in addition to (a) the fluid is incompressible, then $\rho = $ constant and
equation (22.13) becomes:

    $\nabla \cdot \mathbf{u} = 0.$    (22.14)




c. If in addition to (a) and (b) the velocity of the fluid is irrotational, then there exists a
velocity potential $\phi$ such that $\mathbf{u} = -\nabla\phi$ and the continuity equation becomes:

    $\nabla^2 \phi = 0$    (22.15)

which is Laplace's equation.

In an incompressible fluid $\Delta V = 0$ when $\Delta p \neq 0$, that is, a quantity of mass
within a volume⁴ $V$ comoving with the fluid, moving under the influence of external
forces (resulting in a gradient of pressure), can change its shape but not its
volume, which remains constant.


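Exercise 22.3.1 Show that the velocity field
$\mathbf{u} = \frac{-2xyz}{(x^2 + y^2)^2}\,\mathbf{i} + \frac{(x^2 - y^2)z}{(x^2 + y^2)^2}\,\mathbf{j} + \frac{y}{x^2 + y^2}\,\mathbf{k}$
is a possible motion for an incompressible fluid. Is this motion irrotational?
Hint: Prove that $\nabla \cdot \mathbf{u} = 0$. Check if $\nabla \times \mathbf{u} = 0$.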
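
Before attempting the analytic proof, the hint can be explored numerically. The sketch below is a finite-difference spot check, not a proof: it evaluates the divergence and the curl of the field of Exercise 22.3.1 at one sample point. The helper names, the sample point and the step size are assumptions made for illustration.

```python
def u(x, y, z):
    """Velocity field of Exercise 22.3.1 (defined away from the z-axis)."""
    s = x * x + y * y
    return (-2.0 * x * y * z / s ** 2, (x * x - y * y) * z / s ** 2, y / s)

def partial(field, i, j, p, h=1e-5):
    """Central-difference estimate of d(field_i)/d(x_j) at the point p."""
    plus, minus = list(p), list(p)
    plus[j] += h
    minus[j] -= h
    return (field(*plus)[i] - field(*minus)[i]) / (2 * h)

def divergence(field, p):
    return sum(partial(field, i, i, p) for i in range(3))

def curl(field, p):
    return (partial(field, 2, 1, p) - partial(field, 1, 2, p),
            partial(field, 0, 2, p) - partial(field, 2, 0, p),
            partial(field, 1, 0, p) - partial(field, 0, 1, p))

if __name__ == "__main__":
    point = (0.9, 0.6, -1.2)  # arbitrary sample point (assumed)
    print(divergence(u, point))
    print(curl(u, point))
```

Values at round-off level at several sample points suggest what the analytic computation should give, but they do not replace it.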


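Exercise 22.3.2 Write the continuity equation in spherical and cylindrical coordinates.

Answer
Spherical:

    $\frac{\partial \rho}{\partial t} + \frac{1}{r^2}\frac{\partial}{\partial r}(\rho r^2 u_r) + \frac{1}{r \sin\theta}\frac{\partial}{\partial \theta}(\rho \sin\theta\, u_\theta) + \frac{1}{r \sin\theta}\frac{\partial}{\partial \phi}(\rho u_\phi) = \psi(t, r, \theta, \phi).$    (22.16)

Cylindrical:

    $\frac{\partial \rho}{\partial t} + \frac{1}{r}\frac{\partial}{\partial r}(\rho r u_r) + \frac{1}{r}\frac{\partial}{\partial \theta}(\rho u_\theta) + \frac{\partial}{\partial z}(\rho u_z) = \psi(t, r, \theta, z).$    (22.17)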


Exercise 22.3.3 If the velocity of a fluid is radial, that is, $u = u(r, t)$, show that the
equation of continuity is written as follows:

    $\frac{\partial \rho}{\partial t} + u\,\frac{\partial \rho}{\partial r} + \frac{\rho}{r^2}\frac{\partial}{\partial r}(r^2 u) = \psi(t, r).$    (22.18)

Solve this equation for an incompressible fluid if $\psi(t, r) = 1/r^2$ and show that
$u = \frac{1}{\rho r}$.
Solution
The continuity equation in spherical coordinates is given by (22.16). In the case
of radial velocity (that is, $u_\theta = u_\phi = 0$) it takes the form:

    $\frac{\partial \rho}{\partial t} + \frac{1}{r^2}\frac{\partial}{\partial r}(\rho r^2 u) = \frac{1}{r^2}$

⁴ A terminology for this volume is material volume.




where we have replaced $\psi(t, r, \theta, \phi)$ with $1/r^2$.
In case the fluid is incompressible, $\rho = $ constant and the last equation reduces
further to:

    $\frac{\partial}{\partial r}(\rho u r^2) = 1 \;\Rightarrow\; u(r) = \frac{1}{\rho r}.$


22.4 The Equation of Motion of a Perfect Fluid 


We consider a perfect fluid which is acted upon by an external force of density $\mathbf{f}$
(that is, force per unit mass of the fluid). We consider an elementary volume $dV$ in
the fluid with surface area $S$ enclosing a mass $dM = \rho\,dV$ of fluid. We assume that
Newton's Second Law applies, that is, the change of momentum of the mass $dM$ in
time equals the total applied external force. Under these assumptions we derive the
equation of motion of the mass $dM$.

First we compute the total force on the mass dM. This force is due to 


a. The stress force (i.e. the force from the remaining mass of the fluid on the
boundary $\partial(dV)$ of $dM$) and

b. The external forces applied to the fluid as a whole (these forces are called body
forces).

The stress force is given by:

    $-\oint_{S = \partial(dV)} p\,d\mathbf{S} = -\int_{dV} \nabla p\,dV = -\nabla p\,dV$

where the minus sign is due to the fact that we consider $d\mathbf{S}$ to point outwards of the
surface $S$ and $p$ is compression and, furthermore, we have used the fact that the fluid
is perfect, therefore the force is normal to the surface $d\mathbf{S}$.

The second force equals $\mathbf{f}\,dM$ where $\mathbf{f}$ is the force per unit mass
(=force density). Therefore the total force on the mass $dM$ is $\mathbf{f}\,dM - \nabla p\,dV$. The
momentum of the mass $dM$ is $dM\,\mathbf{u}$, hence Newton's Second Law gives:

    $\frac{D(dM\,\mathbf{u})}{dt} = (\rho\,\mathbf{f} - \nabla p)\,dV.$

The term⁵ $\frac{D(dM\,\mathbf{u})}{dt} = dM\,\frac{D\mathbf{u}}{dt} = \rho\,\frac{D\mathbf{u}}{dt}\,dV$ and we obtain finally:

    $\frac{D\mathbf{u}}{dt} = \mathbf{f} - \frac{1}{\rho}\nabla p.$    (22.19)


⁵ The mass $dM$ remains constant in the volume $dV$ as the fluid moves due to the conservation
of mass and the assumption that the volume $dV$ is comoving, i.e. at all times contains the same
number and type of particles.




This is Euler's equation of motion for an elementary material volume within a
perfect fluid. To find Euler's equation in terms of the standard derivatives we replace
$\frac{D\mathbf{u}}{dt}$ from (22.5) and find:

    $\frac{\partial \mathbf{u}}{\partial t} + \frac{1}{2}\nabla u^2 - \mathbf{u} \times (\nabla \times \mathbf{u}) = \mathbf{f} - \frac{1}{\rho}\nabla p.$    (22.20)

Now let us assume that the external force is conservative with a potential
function $\chi$ and that the fluid is homogeneous and the flow is incompressible (hence
$\rho = $ constant). Then the rhs of (22.20) is written $-\nabla\left(\chi + \frac{p}{\rho}\right)$ and the equation of
motion for an incompressible fluid in a conservative force field is:

    $\frac{\partial \mathbf{u}}{\partial t} - \mathbf{u} \times (\nabla \times \mathbf{u}) = -\nabla\left(\chi + \frac{p}{\rho} + \frac{1}{2}u^2\right).$    (22.21)


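If in addition the motion is:

a. Irrotational, then $\nabla \times \mathbf{u} = 0$ and the equation of motion reduces to:

    $\frac{\partial \mathbf{u}}{\partial t} = -\nabla\left(\chi + \frac{p}{\rho} + \frac{1}{2}u^2\right).$    (22.22)

b. Steady, then $\frac{\partial \mathbf{u}}{\partial t} = 0$ and the equation of motion reduces as follows:

    $\mathbf{u} \times (\nabla \times \mathbf{u}) = \nabla\left(\chi + \frac{p}{\rho} + \frac{1}{2}u^2\right).$    (22.23)

From (22.23) it follows that for a perfect fluid in steady motion:

    $\mathbf{u} \cdot \nabla\left(\chi + \frac{p}{\rho} + \frac{1}{2}u^2\right) = 0.$    (22.24)

This equation implies that the Eulerian velocity field $\mathbf{u}$ (not the velocity field of a
fluid particle!) of a perfect fluid in steady motion is always tangent to the surface
$\chi + \frac{p}{\rho} + \frac{1}{2}u^2 = $ constant, therefore this surface is a stream surface. We have proved
the following result.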


Proposition 22.4.1 (Bernoulli Theorem) For a homogeneous perfect fluid of
density $\rho$ under pressure $p$, which flows with stationary (or steady) incompressible
flow under the action of conservative (external) forces with potential function $\chi$, the
expression $\chi + \frac{p}{\rho} + \frac{1}{2}u^2$ is constant along the streamlines. If the fluid moves freely
(that is, $\chi = 0$) then $\frac{p}{\rho} + \frac{1}{2}u^2$ is constant along the streamlines and, if in addition
the fluid is incompressible, then $\rho = $ constant and $p + \frac{1}{2}\rho u^2$ is constant along the
streamlines (the standard Bernoulli Theorem) and an increase of speed of the fluid
at a point demands a decrease of pressure and conversely.
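
As a small worked illustration of the standard Bernoulli Theorem, the sketch below computes the speed and pressure at a narrower section of a stream tube. It assumes, in addition to $\chi = 0$, that the speed is uniform over each cross section so that incompressibility gives $A_1 u_1 = A_2 u_2$. The function name and the numerical values are hypothetical.

```python
def downstream_state(p1, u1, area1, area2, rho):
    """Speed and pressure at section 2 of a stream tube, for chi = 0.

    Assumes uniform speed over each section, so div u = 0 gives A1*u1 = A2*u2,
    and uses p + rho*u^2/2 = constant along the streamline.
    """
    u2 = u1 * area1 / area2
    p2 = p1 + 0.5 * rho * (u1 ** 2 - u2 ** 2)
    return u2, p2

if __name__ == "__main__":
    # Illustrative values (assumed): water-like density, SI units.
    u2, p2 = downstream_state(p1=101325.0, u1=2.0, area1=0.02, area2=0.01, rho=1000.0)
    print(u2, p2)  # the speed doubles and the pressure drops
```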




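Example 22.4.1 Assume a perfect fluid is at rest and show that in this case
$\nabla \times (\rho\,\mathbf{f}) = 0$, hence $\mathbf{f} \cdot \nabla \times \mathbf{f} = 0$. This is a necessary condition for equilibrium of a perfect
fluid. Why must $\rho\,\mathbf{f}$ be the gradient of a scalar if equilibrium is to be possible?

Proof From the Euler equation of motion for a perfect fluid at rest (i.e. $\mathbf{u} = 0$ and
$\frac{\partial \mathbf{u}}{\partial t} = 0$) we have:

    $\mathbf{f} - \frac{1}{\rho}\nabla p = 0 \;\Rightarrow\; \rho\,\mathbf{f} = \nabla p \equiv \nabla\phi$

where $\phi$ is a potential function. This proves the last part of the example and, because
the curl of a gradient vanishes, it gives $\nabla \times (\rho\,\mathbf{f}) = 0$.
Now we use the identity of vector calculus, valid for any scalar field $\phi$:

    $\nabla \times (\phi\,\mathbf{f}) = \phi\,\nabla \times \mathbf{f} + \nabla\phi \times \mathbf{f}$    (22.25)

and compute:

    $\nabla \times (\rho\,\mathbf{f}) = \rho\,\nabla \times \mathbf{f} + \nabla\rho \times \mathbf{f} = 0 \;\Rightarrow\; \rho\,\mathbf{f} \cdot \nabla \times \mathbf{f} + \mathbf{f} \cdot (\nabla\rho \times \mathbf{f}) = 0.$

But $\mathbf{f} \cdot (\nabla\rho \times \mathbf{f}) = \nabla\rho \cdot (\mathbf{f} \times \mathbf{f}) = 0$, therefore $\mathbf{f} \cdot \nabla \times \mathbf{f} = 0$. □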


Exercise 22.4.1 A perfect fluid (which is not necessarily incompressible, therefore $\rho$
is not constant) moves under the action of a conservative force with potential function
$\chi$. If the flow of the fluid is irrotational with velocity potential $\phi$, show that the
equation of motion becomes:


Ip Plo 
a = (x te ) + C(t) (22.26) 


where $C(t)$ is a smooth function. If in addition $p = p(\rho)$ (this is called an equation
of state, about which we shall say more below) then equation (22.26) becomes:


ao 1 d 
oO iP esc / Gs (22.27) 
or 2 p 


where D(t) is another smooth function. 


Example 22.4.2 A sphere moves through an incompressible perfect fluid with
constant velocity $\mathbf{u} = u_0\mathbf{k}$. Find the velocity of the fluid relative to the sphere
assuming that the motion of the fluid is steady and irrotational.
Solution

Because the fluid is incompressible, $\rho = $ const. Because the flow of the fluid is
steady and irrotational:

    $\frac{\partial \mathbf{u}}{\partial t} = 0, \quad \mathbf{u} = \nabla\phi$

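where $\phi$ is a velocity potential. The incompressibility gives:

    $\nabla \cdot \mathbf{u} = 0 \;\Rightarrow\; \nabla^2 \phi = 0.$

At the surface $r = a$ of the sphere the radial velocity of the fluid vanishes.⁶ This implies

    $u_r|_{r=a} = 0 \;\Leftrightarrow\; \left.\frac{\partial \phi}{\partial r}\right|_{r=a} = 0.$

We have to solve Laplace equation $\nabla^2 \phi = 0$ with the boundary condition
$\left.\frac{\partial \phi}{\partial r}\right|_{r=a} = 0$.

We know two solutions of Laplace equation, $r\cos\theta$ and $\cos\theta / r^2$, hence we try a
solution of the form:

    $\phi(r, \theta) = \left(A r + \frac{B}{r^2}\right)\cos\theta$

(we do not take solutions of the form $\phi(\mathbf{r}, t)$ because the motion is steady, hence
$\frac{\partial \phi}{\partial t} = 0$). Then:

    $\left.\frac{\partial \phi}{\partial r}\right|_{r=a} = \left(A - \frac{2B}{a^3}\right)\cos\theta = 0 \;\Rightarrow\; B = \frac{a^3 A}{2}.$

We impose a second boundary condition by requiring that at infinite distance from
the sphere the velocity of the fluid vanishes, therefore the velocity relative to the
sphere will be $-u_0\mathbf{k}$. This gives the boundary condition:

    $\left.\frac{\partial \phi}{\partial z}\right|_{z \to \infty} = -u_0.$

We compute (using $r\cos\theta = z$):

    $\left.\frac{\partial \phi}{\partial z}\right|_{z \to \infty} = \frac{\partial}{\partial z}(A z) = A \;\Rightarrow\; A = -u_0.$

Therefore the solution is:

    $\phi = -u_0\left(r + \frac{a^3}{2r^2}\right)\cos\theta.$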


⁶ The velocity of the fluid relative to the sphere is $\mathbf{u} = \nabla\phi$ and in the coordinate system $\Sigma$ in which
the sphere has velocity $u_0\mathbf{k}$ the velocity of the fluid equals:

    $\mathbf{u}_\Sigma = \nabla\phi + u_0\mathbf{k}.$
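
With $A = -u_0$ and $B = a^3 A/2$ the trial potential of Example 22.4.2 becomes $\phi = -u_0\,(r + a^3/(2r^2))\cos\theta$. The sketch below checks by finite differences that this potential is harmonic, has vanishing normal derivative on the sphere, and gives $\partial\phi/\partial z \to -u_0$ far away. The function names, step sizes and sample points are assumptions made for illustration.

```python
import math

def phi(x, y, z, u0, a):
    """phi = -u0 * (r + a^3 / (2 r^2)) * cos(theta), written with z = r cos(theta)."""
    r = math.sqrt(x * x + y * y + z * z)
    return -u0 * z * (1.0 + a ** 3 / (2.0 * r ** 3))

def laplacian(p, u0, a, h=1e-3):
    """Second-order central-difference estimate of the Laplacian of phi at p."""
    total = 0.0
    for i in range(3):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        total += (phi(*plus, u0, a) - 2.0 * phi(*p, u0, a) + phi(*minus, u0, a)) / h ** 2
    return total

def directional_derivative(p, n, u0, a, h=1e-5):
    """Central-difference derivative of phi at p along the unit vector n."""
    plus = [pi + h * ni for pi, ni in zip(p, n)]
    minus = [pi - h * ni for pi, ni in zip(p, n)]
    return (phi(*plus, u0, a) - phi(*minus, u0, a)) / (2.0 * h)

if __name__ == "__main__":
    u0, a = 3.0, 1.0                      # illustrative values (assumed)
    n = (0.6, 0.0, 0.8)                   # unit normal at a point of the sphere
    print(laplacian((1.2, -0.7, 1.5), u0, a))
    print(directional_derivative(tuple(a * c for c in n), n, u0, a))
    print(directional_derivative((0.0, 0.0, 1e3), (0.0, 0.0, 1.0), u0, a))
```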




Exercise 22.4.2 A perfect fluid revolves uniformly without change in form (i.e. the
relative displacement of its particles does not change; rigid body motion) with
angular velocity $\omega$ around the z-axis. The flow is steady (because $\frac{\partial \mathbf{u}}{\partial t} = 0$),
hence the Eulerian velocity field coincides with the particle velocity field. Hence
the acceleration field is $\frac{D\mathbf{u}}{dt}$.

1. Show that the velocity of a particle of the fluid with coordinates $\{x, y, z\}$ is
$\mathbf{u} = (-\omega y, \omega x, 0)$. Prove that $\nabla \cdot \mathbf{u} = 0$, hence the flow is incompressible. Find the
Eulerian velocity field and prove that there does not exist a velocity potential.
Compute the acceleration field (of the particle).

2. Show that Euler's equations of motion are:

    $-\omega^2 x = X - \frac{1}{\rho}\frac{\partial p}{\partial x}$

    $-\omega^2 y = Y - \frac{1}{\rho}\frac{\partial p}{\partial y}$

    $0 = Z - \frac{1}{\rho}\frac{\partial p}{\partial z}$

where $X, Y, Z$ are the components of the external force density $\mathbf{F}$ acting on the
fluid. Show that from these equations we have:

    $\frac{1}{\rho}\,dp = \mathbf{F} \cdot d\mathbf{r} + \omega^2 (x\,dx + y\,dy).$    (22.28)

Explain the physical meaning of this equation.

3. Show that for a homogeneous fluid and for a conservative force with potential
function $\chi$ equation (22.28) becomes:

    $\frac{p}{\rho} - \frac{1}{2}\omega^2 (x^2 + y^2) + \chi = $ constant.    (22.29)

Hint: Note that because the flow is incompressible and the fluid is homogeneous,
$\rho$ is constant.

4. Suppose that the only external force is gravity and show that equation (22.29)
becomes:

    $\frac{p}{\rho} - \frac{1}{2}u^2 + g z = $ constant

where $u = \omega r$, $r^2 = x^2 + y^2$ are the speed and the distance of the particle from
the z-axis. Draw a picture to describe the motion of the fluid.


22.5 The Navier Stokes Equation 


As is the case with deformable bodies, real fluids are not perfect, which means that
during their motion they develop stress forces (i.e. friction-like forces) between adjacent
layers, and if a test surface is immersed in a fluid the net force on the surface is not
normal to the surface. This implies that we have to describe the physical properties
of a real fluid by means of more tensor fields than the mass density field $\rho(t, \mathbf{r})$ and
the velocity field $\mathbf{u}(t, \mathbf{r})$. A fluid which is not perfect we call an imperfect fluid.

The Navier Stokes equation is the equation of motion of a viscous fluid (to
be defined below) under the action of stress and external (body) forces. In order
to understand the Physics behind this equation it is important that we reconsider
Euler's equation of motion of a perfect fluid

    $\rho\,\frac{D\mathbf{u}}{dt} = \rho\,\mathbf{f} - \nabla p.$    (22.30)

The term $\rho\,\frac{D\mathbf{u}}{dt}$ is the kinematic part (i.e. mass times acceleration) of Newton's
Second Law. It contains the density instead of mass because we use the density of
force and not the force itself. The term $\rho\,\mathbf{f}$ is the total external force density acting on
the fluid and lastly the term $-\nabla p$ is the pressure thrust on the surface of an elementary
volume within the fluid ($p$ is the isotropic pressure inside the fluid). If the external
force vanishes the fluid moves under the action of the pressure term $-\nabla p$ only. How
can we understand a perfect fluid in the context of deformable bodies? We claim that
the perfect fluid is a deformable body on which the only possible stress tensor is the
bulk stress given by the relation:

    $t_{\mu\nu} = -p\,\delta_{\mu\nu}$    (22.31)


where $p$ is the isotropic pressure. This means that the only strain motion which can
take place under stress in a perfect fluid is contraction and expansion. In order for our
claim to be valid it must conform with Euler's equation of motion. As we have seen,
the force resulting from the application of the stress (giving rise to the corresponding
strain) is $f^\mu = t^{\mu\nu}{}_{,\nu}$. Placing $t_{\mu\nu} = -p\,\delta_{\mu\nu}$ in this expression we find
$f^\mu = -p^{,\mu} = -(\nabla p)^\mu$, which proves that our choice of $t_{\mu\nu}$ is correct.

The strain tensor of a perfect fluid (which is a special case of a linear elastic
isotropic body) is computed from (21.16) if we set $t_{\mu\nu} = -p\,\delta_{\mu\nu}$. We find:

    $e_{\mu\nu} = -p\,\beta\,\delta_{\mu\nu}$    (22.32)

where $\beta$ is a coefficient characteristic of the material of the perfect fluid. We note
that for a perfect fluid the principal axes of strain and stress coincide. From a
geometric point of view we infer that the strain and the stress metric $e_{\mu\nu}$ and
$t_{\mu\nu}$ respectively are conformally (or homothetically if $p = $ constant) related to the




Euclidean metric $\delta_{\mu\nu}$. The homothety factor for the stress metric $t_{\mu\nu}$ is the isotropic
pressure $-p$ and for the strain metric the factor $-\beta p$.

We consider now the case of a real fluid. A real fluid differs from a perfect fluid
in the property that rapidly moving adjacent layers of the fluid tend to drag along
the slower layers of fluid and, conversely, the slower layers tend to retard the faster
layers. In other words there is a "friction" amongst the various layers in the fluid
which is due to forces which are tangent to the relative velocity of the surfaces.
The phenomenon that the forces between adjacent moving fluid surfaces are not
normal to the relative velocity of the surfaces at each point of their interface we
call viscosity. Viscosity is the transfer to fluids of the usual concept of friction experienced
by solid bodies in relative motion. Concerning the value of the viscosity it has
been found by experiment that (for standard fluids) the force due to viscosity is
directly proportional to the common area $A$ of the layers and to the gradient of the
relative velocity normal to the flow.

Assuming the above experimental results we determine the relation between the
stress and the strain in a real fluid. We restrict our considerations to real fluids
which can be described as linear elastic isotropic media and which under bulk stress alone behave as
perfect fluids. Fluids of this type we call viscous fluids.

According to the previous sections the stress and the strain tensor of a viscous
fluid must be related by equation (21.3), that is:

    $t_{\mu\nu} = 2\eta\,\dot{e}_{\mu\nu} + \lambda\,\delta_{\mu\nu} \sum_\rho \dot{e}_{\rho\rho}$    (22.33)

where $\eta, \lambda$ are coefficients (not necessarily constants) characteristic of the fluid.
This stress tensor $t_{\mu\nu}$ we call the viscous stress tensor. Because for a perfect fluid
the strain tensor is proportional to $\delta_{\mu\nu}$ we rewrite this equation as follows:

    $t_{\mu\nu} = 2\eta\left(\dot{e}_{\mu\nu} - \frac{1}{3}\mathrm{Trace}(\dot{e})\,\delta_{\mu\nu}\right) + \left(\lambda + \frac{2}{3}\eta\right)\delta_{\mu\nu} \sum_\rho \dot{e}_{\rho\rho}$

that is, we break the strain tensor into a trace and a traceless part. The trace part takes
care of the "perfect fluid" character of the fluid and the traceless part of the viscous character of
the fluid.

Because the bulk stress is given by $-p\,\delta_{\mu\nu}$ we set $\left(\lambda + \frac{2}{3}\eta\right)\sum_\rho \dot{e}_{\rho\rho} = -p$, where
$p$ is the isotropic pressure within the fluid, and we write for the stress tensor of a
viscous (not a general!) fluid:

    $t_{\mu\nu} = 2\eta\left(\dot{e}_{\mu\nu} - \frac{1}{3}\dot{e}\,\delta_{\mu\nu}\right) - p\,\delta_{\mu\nu}$    (22.34)

where $\dot{e} = \sum_\rho \dot{e}_{\rho\rho}$ and $\dot{e}_{\mu\nu}$ is the rate of the strain tensor of the viscous fluid. The
coefficient $\eta$ we call the coefficient of viscosity. A perfect fluid is defined by the




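requirement $\eta = 0$, therefore the viscous fluids are a general class of fluids which
contains perfect fluids.

From (22.34) we compute the force density on a viscous fluid by making use of
equation (21.29). We have:

    $F^\mu = t^{\mu\nu}{}_{,\nu} = 2\eta\left(\dot{e}^{\mu\nu}{}_{,\nu} - \frac{1}{3}\dot{e}_{,\nu}\,\delta^{\mu\nu}\right) - p^{,\mu}$    (22.35)

where we have assumed $\eta_{,\nu} = 0$, i.e. $\eta$ is constant. We recall (see (20.24)) that
the strain tensor in terms of the strain vector for a general strain motion of a linear
elastic isotropic deformable body is given by the expression:

    $e_{\mu\nu} = \frac{1}{2}\left(u_{\mu,\nu} + u_{\nu,\mu}\right).$    (22.36)

Replacing in (22.35) (for the rate of strain the vector $u^\mu$ is the velocity) we find:

    $F^\mu = \eta\left(u^{\mu,\nu}{}_{,\nu} + u^{\nu,\mu}{}_{,\nu} - \frac{2}{3}\Big(\sum_\rho u^\rho{}_{,\rho}\Big)_{,\nu}\delta^{\mu\nu}\right) - p^{,\mu}.$

In standard vector notation this is written as:

    $\mathbf{F} = \eta\left(\nabla^2 \mathbf{u} + \frac{1}{3}\nabla(\nabla \cdot \mathbf{u})\right) - \nabla p.$    (22.37)

If the density of the applied external force on the fluid is $f^\mu$, Newton's Second Law
gives for the equation of motion of a viscous fluid:

    $\rho\,\frac{Du^\mu}{dt} = \rho f^\mu + F^\mu$    (22.38)

where $F^\mu$ is the force density due to the stress on the fluid. Using (22.37) this
equation can be written:

    $\rho\,\frac{Du^\mu}{dt} = \rho f^\mu + \eta\left(u^{\mu,\nu}{}_{,\nu} + u^{\nu,\mu}{}_{,\nu} - \frac{2}{3}\Big(\sum_\rho u^\rho{}_{,\rho}\Big)_{,\nu}\delta^{\mu\nu}\right) - p^{,\mu}.$    (22.39)

In standard vector notation this is:

    $\frac{D\mathbf{u}}{dt} = \mathbf{f} + \frac{\eta}{\rho}\left[\nabla^2 \mathbf{u} + \frac{1}{3}\nabla(\nabla \cdot \mathbf{u})\right] - \frac{1}{\rho}\nabla p$    (22.40)

or, after replacing $\frac{D\mathbf{u}}{dt}$:

    $\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\mathbf{u} = \mathbf{f} + \frac{\eta}{\rho}\left[\nabla^2 \mathbf{u} + \frac{1}{3}\nabla(\nabla \cdot \mathbf{u})\right] - \frac{1}{\rho}\nabla p.$    (22.41)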




Equation (22.39) or (22.41) is the Navier Stokes equation for the motion of a
viscous fluid. The quotient $\frac{\eta}{\rho}$ is called the kinematic coefficient of viscosity. We
note that for a perfect fluid $\eta = 0$ so that this equation reduces to Euler's equation
of motion (22.19).

For an incompressible viscous fluid $\nabla \cdot \mathbf{u} = 0$ and equations (22.39) and (22.40)
become respectively:

    $\frac{Du^\mu}{dt} = f^\mu + \frac{\eta}{\rho}\,u^{\mu,\nu}{}_{,\nu} - \frac{1}{\rho}\,p^{,\mu}$    (22.42)

    $\frac{D\mathbf{u}}{dt} = \mathbf{f} + \frac{\eta}{\rho}\nabla^2 \mathbf{u} - \frac{1}{\rho}\nabla p.$    (22.43)


This equation of motion and the continuity equation are the equations which govern 
the motion of an incompressible Newtonian viscous fluid. 
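
A standard illustration of the incompressible Navier Stokes equation, which is not worked out in the text, is steady flow between two fixed parallel plates at $y = \pm h$ driven by a constant pressure drop $G = -\partial p/\partial x$, with no body force. For $\mathbf{u} = (u(y), 0, 0)$ the convective term vanishes and $\nabla \cdot \mathbf{u} = 0$ holds identically, so the x-component of the equation reduces to $\eta\,u'' + G = 0$. With no-slip boundary conditions this gives the parabolic profile used in the sketch below. All names and numerical values are assumptions made for illustration.

```python
def poiseuille_velocity(y, G, eta, h):
    """u_x(y) between plates at y = -h and y = +h; G = -dp/dx > 0, no-slip walls."""
    return G * (h * h - y * y) / (2.0 * eta)

def momentum_residual(y, G, eta, h, dy=1e-3):
    """x-component of eta * laplacian(u) - grad(p) for the profile above.

    It vanishes when the steady equation with f = 0 and Du/dt = 0 is satisfied.
    """
    second_derivative = (poiseuille_velocity(y + dy, G, eta, h)
                         - 2.0 * poiseuille_velocity(y, G, eta, h)
                         + poiseuille_velocity(y - dy, G, eta, h)) / dy ** 2
    return eta * second_derivative + G

if __name__ == "__main__":
    print(poiseuille_velocity(0.0, G=2.0, eta=0.5, h=1.0))   # centreline speed
    print(momentum_residual(0.3, G=2.0, eta=0.5, h=1.0))     # close to zero
```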


Example 22.5.1 Prove Archimedes’ Law: The force exerted by a static fluid on a 
body immersed in the fluid (without changing its shape) equals the weight of the 
displaced fluid and points in the direction opposite to the force of gravity. 

Solution 

Because the body (a) is at rest and (b) does not change its shape, the velocity
$\mathbf{v}$ and the viscous stress tensor $t^{\mu\nu}$ vanish (the fluid need not be perfect, i.e. $\eta$ does not
necessarily vanish). Then the Navier Stokes equation reads:

    $\rho f^\mu = p^{,\mu}$

where $f^\mu$ is the external force density (i.e. force per unit mass). The only external
force acting on the fluid is the force of gravity, whose density is the acceleration of
gravity $g^\mu$. Then we find that the gradient of pressure is:

    $p^{,\mu} = \rho g^\mu.$

Suppose the body has a volume $V$ with boundary the surface $S$. Then the force
on the body is:

    $F^\mu = -\oint_S p\,n^\mu\,dS$

where $p$ is the isotropic (inward) pressure in the mass of the fluid and the (inward)
pressure on the surface of the body, and $n^\mu dS$ is the (outward) element of surface of the
body. Applying the divergence theorem we find:

    $F^\mu = -\oint_S p\,n^\mu\,dS = -\int_V p^{,\mu}\,dV = -g^\mu \int_V \rho\,dV = -g^\mu m$

where $m = \int_V \rho\,dV$ is the mass of the fluid in the volume occupied by the body.
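
Archimedes' Law can be illustrated numerically by integrating the pressure force over the surface of a sphere immersed in a fluid of constant density. The sketch below assumes a hydrostatic pressure $p = p_0 - \rho g z$ with the z-axis pointing upwards (so that $\nabla p = \rho\,\mathbf{g}$) and uses a simple midpoint rule in spherical angles. The function name, grid sizes and numerical values are assumptions made for illustration.

```python
import math

def pressure_force_on_sphere(R, rho, g, p0, n_theta=400, n_phi=64):
    """Return (F_x, F_z) = -(surface integral of p n dS) over a sphere of radius R.

    The sphere is centred at the origin; p = p0 - rho*g*z with z pointing up.
    """
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    fx = fz = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            nx = math.sin(theta) * math.cos(phi)   # outward unit normal, x-component
            nz = math.cos(theta)                   # outward unit normal, z-component
            p = p0 - rho * g * R * nz              # pressure at the surface point
            dS = R * R * math.sin(theta) * d_theta * d_phi
            fx -= p * nx * dS
            fz -= p * nz * dS
    return fx, fz

if __name__ == "__main__":
    fx, fz = pressure_force_on_sphere(R=0.5, rho=1000.0, g=9.81, p0=101325.0)
    weight = 1000.0 * 9.81 * (4.0 / 3.0) * math.pi * 0.5 ** 3
    print(fx, fz, weight)  # fz should match the weight of the displaced fluid
```

The uniform part $p_0$ of the pressure contributes nothing to the total force; only the pressure gradient matters, as the divergence theorem step in the solution shows.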


22.6 The Electromagnetic Field as a Newtonian Viscous Fluid 


The viscous fluid is a state of matter and not a specific substance like, for example,
water or oil. In this section we shall study the electromagnetic field in empty space
from the point of view of a viscous fluid, that is, we shall consider a fluid of
"photons" and, assuming that the fluid satisfies Maxwell equations, we shall compute
the stress tensor and other physical quantities.

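In our calculations we shall need identity (19.7) which we rewrite here for
convenience:

    $\nabla v^2 = 2\mathbf{v} \times (\nabla \times \mathbf{v}) + 2(\mathbf{v} \cdot \nabla)\mathbf{v}$    (22.44)

and in tensor notation (see (19.12)):

    $(\delta^{ij} u_i u_j)_{,k} = 2\eta_{kij}\,\eta^{jlm}\,u^i u_{m,l} + 2u^i u_{k,i}.$

Maxwell equations in SI units for the electromagnetic field in a material are⁷:

    $\nabla \cdot \mathbf{B} = 0$    (22.45)
    $\nabla \cdot \mathbf{D} = \rho$    (22.46)
    $\nabla \times \mathbf{H} = \mathbf{j} + \frac{\partial \mathbf{D}}{\partial t}$    (22.47)
    $\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}.$    (22.48)

In 3-d tensor notation these equations are written as follows:

    $B^\mu{}_{,\mu} = 0$    (22.49)
    $D^\mu{}_{,\mu} = \rho$    (22.50)
    $\eta_{\mu\nu\rho} H^{\rho,\nu} = j_\mu + D_{\mu,t}$    (22.51)
    $\eta_{\mu\nu\rho} E^{\rho,\nu} = -B_{\mu,t}.$    (22.52)

In empty space the constitutive relations are $\mathbf{D} = \varepsilon_0 \mathbf{E}$, $\mathbf{B} = \mu_0 \mathbf{H}$ where $\varepsilon_0$ is
the electric permittivity and $\mu_0$ is the magnetic permeability of empty space,
satisfying the relation $\varepsilon_0 \mu_0 = \frac{1}{c^2}$.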


Example 22.6.1 As is well known, the quantity $\frac{1}{c^2}E^2 + B^2$ is, up to the factor $\frac{1}{2\mu_0}$, the energy density
of the electromagnetic field. We will compute the gradient of this quantity using


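⁷ Equation $\nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0}$ in general is $\nabla \cdot \mathbf{D} = \rho$. Also recall that for empty space $\mathbf{B} = \mu_0 \mathbf{H}$ and
$\mathbf{D} = \varepsilon_0 \mathbf{E}$.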



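identity (22.44) and replace $\nabla \times \mathbf{E}$ and $\nabla \times \mathbf{B}$ from Maxwell equations. We find for
the electric field:

    $\nabla E^2 = 2\mathbf{E} \times (\nabla \times \mathbf{E}) + 2(\mathbf{E} \cdot \nabla)\mathbf{E} = 2\mathbf{E} \times \left(-\frac{\partial \mathbf{B}}{\partial t}\right) + 2(\mathbf{E} \cdot \nabla)\mathbf{E}$

    $\qquad = -2\mu_0\,\frac{\partial (\mathbf{E} \times \mathbf{H})}{\partial t} + 2\,\frac{\partial \mathbf{E}}{\partial t} \times \mathbf{B} + 2(\mathbf{E} \cdot \nabla)\mathbf{E}$

    $\qquad = -2\mu_0\,\frac{\partial \mathbf{S}}{\partial t} + 2\,\frac{\partial \mathbf{E}}{\partial t} \times \mathbf{B} + 2(\mathbf{E} \cdot \nabla)\mathbf{E}$    (22.53)

where:

    $\mathbf{S} = \mathbf{E} \times \mathbf{H}, \quad S^\mu = \eta^{\mu\nu\rho} E_\nu H_\rho$    (22.54)

is the Poynting vector. Similarly for $B^2$ we have:

    $\nabla B^2 = 2\mathbf{B} \times (\nabla \times \mathbf{B}) + 2(\mathbf{B} \cdot \nabla)\mathbf{B} = 2\mathbf{B} \times \mu_0\left(\mathbf{j} + \varepsilon_0\frac{\partial \mathbf{E}}{\partial t}\right) + 2(\mathbf{B} \cdot \nabla)\mathbf{B}$

    $\qquad = -2\mu_0\,(\mathbf{j} \times \mathbf{B}) - \frac{2}{c^2}\,\frac{\partial \mathbf{E}}{\partial t} \times \mathbf{B} + 2(\mathbf{B} \cdot \nabla)\mathbf{B}.$    (22.55)

Dividing (22.53) by $c^2$ and adding (22.55) we find:

    $\frac{1}{2}\nabla\left(\frac{E^2}{c^2} + B^2\right) + \frac{\mu_0}{c^2}\frac{\partial \mathbf{S}}{\partial t} = -\mu_0\,(\mathbf{j} \times \mathbf{B}) + (\mathbf{B} \cdot \nabla)\mathbf{B} + \frac{1}{c^2}(\mathbf{E} \cdot \nabla)\mathbf{E}.$    (22.56)

It is easier if we continue with tensor formalism. The term:

    $[(\mathbf{E} \cdot \nabla)\mathbf{E}]^\mu = E^\nu \partial_\nu E^\mu = \partial_\nu(E^\mu E^\nu) - (\partial_\nu E^\nu)E^\mu = \partial_\nu(E^\mu E^\nu) - \frac{\rho}{\varepsilon_0}E^\mu$    (22.57)

and similarly:

    $[(\mathbf{B} \cdot \nabla)\mathbf{B}]^\mu = B^\nu \partial_\nu B^\mu = \partial_\nu(B^\mu B^\nu) - (\partial_\nu B^\nu)B^\mu = \partial_\nu(B^\mu B^\nu).$    (22.58)

Replacing in (22.56) we find:

    $\partial_\nu\left(\frac{E^\mu E^\nu}{c^2} + B^\mu B^\nu - \frac{1}{2}\delta^{\mu\nu}\left(\frac{E^2}{c^2} + B^2\right)\right) - \frac{\mu_0}{c^2}\frac{\partial S^\mu}{\partial t} = \mu_0\,(\rho\,\mathbf{E} + \mathbf{j} \times \mathbf{B})^\mu.$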




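The term on the rhs is $\mu_0$ times the Lorentz force density⁸ (in SI units) $\mathbf{F} = \rho\,\mathbf{E} + \mathbf{j} \times \mathbf{B}$,
therefore we obtain the final form:

    $F^\mu - \frac{1}{\mu_0}\,\partial_\nu\left(\frac{E^\mu E^\nu}{c^2} + B^\mu B^\nu - \frac{1}{2}\delta^{\mu\nu}\left(\frac{E^2}{c^2} + B^2\right)\right) = -\frac{1}{c^2}\frac{\partial S^\mu}{\partial t}.$    (22.59)

Equation (22.59) is the equation of motion for the "electromagnetic fluid".
We define the symmetric second rank tensor:

    $\pi^{\mu\nu} = \frac{E^\mu E^\nu}{c^2} + B^\mu B^\nu - \frac{1}{2}\delta^{\mu\nu}\left(\frac{E^2}{c^2} + B^2\right)$    (22.60)

and the equation of motion becomes:

    $F^\mu = \frac{1}{\mu_0}\,\pi^{\mu\nu}{}_{,\nu} - \frac{1}{c^2}\frac{\partial S^\mu}{\partial t}.$    (22.61)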


Comparing (22.60) with (13.249) we see that the stress tensor of the electromagnetic
fluid we found coincides with the stress tensor of the electromagnetic field.

Equation (22.61) coincides with the equation of motion of a viscous fluid (see
equation (22.38)) if we identify the Lorentz force $F^\mu$ with the external force and the
tensor $\pi^{\mu\nu}$ with the stress tensor of the electromagnetic fluid. In that case the vector
$S^\mu/c^2$ is the momentum density of the fluid provided $(\mathbf{S} \cdot \nabla)\mathbf{S} = 0$.

It is important to note that although the "particles" of the electromagnetic fluid
have zero mass ($\rho = 0$) and zero charge there is still a change in the fluid
momentum, which is due to the stress $\pi^{\mu\nu}$. The tensor $\pi^{\mu\nu}$ is called the Maxwell
stress tensor and does exactly what has been said above.

We recall that the stress tensor of a viscous fluid is (see (22.34)):

    $t_{\mu\nu} = 2\eta\left(\dot{e}_{\mu\nu} - \frac{1}{3}\dot{e}\,\delta_{\mu\nu}\right) - p\,\delta_{\mu\nu}.$


Comparing with (22.60) we infer that for the electromagnetic viscous fluid the 
following hold: 


a. The (isotropic) pressure is $p = \frac{1}{6}\left(\frac{E^2}{c^2} + B^2\right)$.


⁸ The current $\mathbf{j} = \rho\,\mathbf{v}$ for empty space. For materials the rhs contains more terms.


c. The energy of the electromagnetic field is $\frac{1}{2}\left(\frac{E^2}{c^2} + B^2\right)$.

d. For the electromagnetic viscous fluid the equation of state⁹ is:
$p = \frac{1}{3} \cdot \frac{1}{2}\left(\frac{E^2}{c^2} + B^2\right)$, that is, the pressure equals one third of the energy of item c.

e. The viscosity of the electromagnetic viscous fluid is $\eta = 1$.
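
The comparison between (22.60) and (22.34) can be made concrete with a small computation. Since the first term of (22.34) is traceless, the isotropic pressure is minus one third of the trace of the stress tensor. The sketch below applies this to $\pi^{\mu\nu}$ for sample field values and compares the result with one third of $\frac{1}{2}(E^2/c^2 + B^2)$. As in the text, the overall factor $1/\mu_0$ is left out; the function names and the field values are assumptions made for illustration.

```python
def maxwell_tensor(E, B, c):
    """pi^{mu nu} = E^mu E^nu / c^2 + B^mu B^nu - (1/2) delta^{mu nu} (E^2/c^2 + B^2)."""
    s = sum(e * e for e in E) / c ** 2 + sum(b * b for b in B)
    return [[E[m] * E[n] / c ** 2 + B[m] * B[n] - (0.5 * s if m == n else 0.0)
             for n in range(3)] for m in range(3)]

def energy_density(E, B, c):
    """(1/2) (E^2/c^2 + B^2), the energy expression used in the text."""
    return 0.5 * (sum(e * e for e in E) / c ** 2 + sum(b * b for b in B))

def isotropic_pressure(pi):
    """p = -trace(pi) / 3, by comparison with the viscous stress tensor (22.34)."""
    return -sum(pi[m][m] for m in range(3)) / 3.0

def traceless_part(pi):
    """pi + p * delta, the part to be compared with the viscous (traceless) term."""
    p = isotropic_pressure(pi)
    return [[pi[m][n] + (p if m == n else 0.0) for n in range(3)] for m in range(3)]

if __name__ == "__main__":
    E, B, c = (1.0, 2.0, -0.5), (0.3, -1.0, 2.0), 2.0   # arbitrary sample values (assumed)
    pi = maxwell_tensor(E, B, c)
    print(isotropic_pressure(pi), energy_density(E, B, c) / 3.0)
```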


22.7 A Short Detour to Thermodynamics 
and Hydrodynamics 


Newtonian continuum mechanics is concerned with the motion of continuous media
in three-dimensional Euclidean space and uses the notion of absolute simultaneity
inherent in Newtonian Physics. Thus Newtonian hydrodynamics is formulated in
terms of four independent variables which may be taken to be the three Cartesian
coordinates in the Euclidean 3-space $\{x^i\}$, $i = 1, 2, 3$, and the time $t$. The subject is
concerned with the description of the state of the fluid as a function of these four
independent variables. The state is described macroscopically by five dependent
variables, which may be taken to be two dynamic variables, such as the pressure $p$ and
the mass density $\rho$, and three kinematic ones, such as the components of the Eulerian
velocity field.

Fluids are continua which consist of many fluid elements and each fluid element 
consists of many particles. The state of matter in a given fluid element is described 
in terms of a small number of variables called thermodynamic variables. In 
Newtonian thermodynamics we introduce the following five quantities, which are 
functions of r, t and describe macroscopically the state of the system: 

$p$ = pressure, $\rho$ = mass density, $T$ = temperature, $S$ = entropy, $W$ = enthalpy
(= total energy).

The thermodynamic variables are related by the First and the Second Law of 
Thermodynamics. Therefore only three of them are independent. 

Let us consider a fluid consisting of one type of particles only and assume that
the independent variables are the $S, V, N$. Then the combined form of the First and
the Second Law is


$$dW = T\,dS - p\,dV + \mu\,dN \tag{22.62}$$


⁹ The equation of state is a relation between the isotropic pressure and the energy density. We
meet this equation frequently in General Relativity.




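where $\mu$ is the chemical potential and $W(S, V, N)$ is the enthalpy (= total energy)
of the fluid element. It follows that:


$$T = \left(\frac{\partial W}{\partial S}\right)_{V,N}, \qquad p = -\left(\frac{\partial W}{\partial V}\right)_{S,N}, \qquad \mu = \left(\frac{\partial W}{\partial N}\right)_{S,V}. \tag{22.63}$$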


The thermodynamic variables are characterized as extensive and intensive. A
thermodynamic variable $X$ is called extensive if it changes its value to $kX$ when
the independent variables $Y$ change their value to $kY$, and intensive if it does not
change its value. For example, consider as independent variables the $S, V, N$. Then
as $N \to kN$ (together with $S \to kS$, $V \to kV$) we have $W \to kW$, so that $W$ is an
extensive variable. This scaling property allows us to find a relation among the
thermodynamic variables known as the Euler relation (also known as the fundamental
relation) as follows. Let $\bar{W} = kW$, $\bar{S} = kS$, $\bar{V} = kV$, $\bar{N} = kN$ be the extensive
variables after a change of the size of the system. The thermodynamic laws give for
the new values of the variables:


$$\begin{aligned} d\bar{W} &= T\,d\bar{S} - p\,d\bar{V} + \mu\,d\bar{N} \\ &= k(T\,dS - p\,dV + \mu\,dN) + (TS - pV + \mu N)\,dk \\ &= k\,dW + (TS - pV + \mu N)\,dk. \end{aligned}$$


We also have:

$$d\bar{W} = k\,dW + W\,dk.$$

Comparing these relations we infer the Euler relation:

$$W = TS - pV + \mu N. \tag{22.64}$$
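
The Euler relation can be checked numerically for any fundamental relation that is homogeneous of degree one. The following is a minimal sketch under stated assumptions: the function `W` below (an ideal-gas-like form) and the sample state are hypothetical choices made only for illustration and do not come from the text; the derivatives of (22.63) are approximated by central finite differences.

```python
import math

def W(S, V, N, a=1.0, kB=1.0):
    """Hypothetical fundamental relation, homogeneous of degree one in (S, V, N)."""
    return a * N ** (5.0 / 3.0) * V ** (-2.0 / 3.0) * math.exp(2.0 * S / (3.0 * N * kB))

def partial(f, args, i, h=1e-6):
    """Central finite-difference derivative of f with respect to argument number i."""
    up = list(args)
    dn = list(args)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2.0 * h)

state = (1.5, 2.0, 3.0)            # sample values of S, V, N (assumed)
S, V, N = state
T = partial(W, state, 0)           # T  =  (dW/dS)_{V,N}
p = -partial(W, state, 1)          # p  = -(dW/dV)_{S,N}
mu = partial(W, state, 2)          # mu =  (dW/dN)_{S,V}

euler_rhs = T * S - p * V + mu * N  # right-hand side of (22.64)
k = 2.5
scaled = W(k * S, k * V, k * N)     # value after rescaling the size of the system

print(W(*state), euler_rhs, scaled / k)
```

The three printed numbers should agree up to the finite-difference error, which illustrates that extensivity of $W$ and the Euler relation are two expressions of the same property.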
The extensive variables are the $W, S, N$; therefore we define the specific entropy
$s = S/M$ and the specific enthalpy $w = W/M$, which are the entropy and the
enthalpy per unit of mass. In terms of the specific quantities the Euler relation (22.64) reads:


$$w = Ts - \frac{1}{\rho}\,p + \mu n \tag{22.65}$$


where $n = N/M$ is the number of particles per unit of mass of the fluid and $\rho = M/V$ is
the mass density. This equation contains only intensive thermodynamic variables and
can be used to reduce the number of variables required for a complete specification of
the state of the system from three to two.

We choose as free variables the $s, p$ and write for the rest of the variables:


$$w(s, p), \qquad \rho(s, p), \qquad \mu(s, p).$$




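The first law of thermodynamics becomes in terms of these variables:

$$dw = T\,ds + \frac{1}{\rho}\,dp \tag{22.66}$$


the temperature and the mass density being given by the equations:


$$T = \left(\frac{\partial w}{\partial s}\right)_p, \qquad \frac{1}{\rho} = \left(\frac{\partial w}{\partial p}\right)_s. \tag{22.67}$$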


The type of fluid is described by specifying its specific internal energy $\varepsilon$, which
is defined as follows:


$$\varepsilon = w - \frac{p}{\rho}. \tag{22.68}$$


Each specification of $\varepsilon$ is called a caloric equation of state and prescribes the
specific internal energy $\varepsilon$ as a function of pressure and density. The quantity of
fluid is described by the number of particles in each fluid element. Taking as variables
the $s, \rho$ (so that the pressure is $p(s, \rho)$ and the specific enthalpy is $w(s, \rho)$) we
find from (22.66) and (22.68):


$$d\varepsilon = T\,ds + \frac{p}{\rho^2}\,d\rho. \tag{22.69}$$

In this case:

$$p = \rho^2\left(\frac{\partial\varepsilon}{\partial\rho}\right)_s.$$
The flow of a fluid is called isentropic if $ds = 0$. In this case (22.66) implies:

$$dw = \frac{1}{\rho}\,dp \iff \operatorname{grad} w = \frac{1}{\rho}\operatorname{grad} p \tag{22.70}$$


that is, a flow is isentropic if there is a function $w$, called the specific enthalpy, so
that (22.70) is satisfied. If for an isentropic flow the pressure is a function of $\rho$ only
(and not of $s$) then the specific enthalpy is:


$$w = \int \frac{dp}{\rho}. \tag{22.71}$$




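For isentropic flow the internal energy becomes:


$$d\varepsilon = \frac{p}{\rho^2}\,d\rho \;\Rightarrow\; p = \rho^2\left(\frac{\partial\varepsilon}{\partial\rho}\right)_s \tag{22.72}$$

and if $p$ is a function of $\rho$ only:


$$\varepsilon = \int \frac{p}{\rho^2}\,d\rho. \tag{22.73}$$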
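
The isentropic formulas (22.68), (22.71) and (22.73) can be cross-checked numerically. The sketch below assumes, purely for illustration, a polytropic law $p = K\rho^{\Gamma}$; the constants `K`, `Gamma` and the density interval are hypothetical values, not taken from the text. Both integrals are evaluated with a simple trapezoidal rule and compared through $\varepsilon = w - p/\rho$.

```python
def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

K, Gamma = 2.0, 5.0 / 3.0          # assumed constants of the illustrative law p = K rho^Gamma

def pressure(rho):
    return K * rho ** Gamma

def dp_drho(rho):
    return K * Gamma * rho ** (Gamma - 1.0)

rho0, rho1 = 0.5, 2.0              # sample density interval (assumed)

# (22.71): change of w, with dp / rho rewritten as (dp/drho) / rho d(rho)
dw = integrate(lambda r: dp_drho(r) / r, rho0, rho1)
# (22.73): change of the specific internal energy
deps = integrate(lambda r: pressure(r) / r ** 2, rho0, rho1)
# (22.68): eps = w - p / rho, so the two changes must differ by the change of p / rho
d_p_over_rho = pressure(rho1) / rho1 - pressure(rho0) / rho0

print(dw, deps, dw - d_p_over_rho)
```

For this polytropic choice the integral (22.73) also has the closed form $K\rho^{\Gamma-1}/(\Gamma-1)$, which gives an independent check of `deps`.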


Chapter 23
Relativistic Fluids in Special Relativity


23.1 General Definitions 


Up to now we have considered relativistic systems with a finite number of
particles and expressed their physical quantities in terms of four-vectors. In this
chapter we extend our study to fluids which are dynamical relativistic systems 
consisting of an infinite number of particles. The way we shall follow is the familiar 
one, that is, the “generalization” of the corresponding Newtonian concepts and the 
physical quantities associated with them. 

The first issue we meet is that in spacetime we do not have deformable bodies
but only geometric constructions. How, then, shall we do Physics in Special (and
General) Relativity, and specifically the Physics of fluids? To overcome this issue we
use the correspondence of a physical particle with a world line in spacetime, that is,
we transfer the concept of “body” (which in the following we shall call a relativistic
fluid) to spacetime as follows:


a. A relativistic body/fluid is a congruence of (locally smooth) parametrized curves
$x^a(s)$, where $s$ is the parameter along the element of the congruence.

b. For particles with non-zero mass the elements of the congruence are timelike
curves and for particles of zero mass (e.g. photons) null curves.

c. The curves of the congruence are the integral curves of a differentiable vector
field $u^a$ which for a timelike congruence has length $u^au_a < 0$ and for null curves
$u^au_a = 0$. The parameter $s$ for which $u^au_a = -1$ we call the affine parameter
along the congruence and denote by $\tau$. The physical meaning of $\tau$ is that it is
the proper time of the particle whose world line is the element of the congruence,
whereas $u^a$ is its four-velocity.

d. At each point $P$ of a timelike congruence/fluid the observer whose four-velocity
at $P$ equals $u^a$ we call the comoving observer at the point $P$.


As is the case with Newtonian fluids, we have to define the kinematics and
the dynamics of a relativistic fluid. The first concerns the fluid/congruence itself
and is defined by the 4-velocity $u^a$ of the fluid and the 1+3 decomposition of
the derivative $u_{a;b}$ of the fluid velocity along $u^a$. The dynamics concerns the
interaction of the congruence/fluid with the environment and requires more geometric
objects in order to be defined.


© Springer Nature Switzerland AG 2019
M. Tsamparlis, Special Relativity, Undergraduate Lecture Notes in Physics,
https://doi.org/10.1007/978-3-030-27347-7_23

As we have shown in Newtonian Physics the dynamics of a Newtonian fluid 
requires the introduction of the stress and the strain tensors, which are second order 
tensors. Therefore we expect that in order to generalize the concept of a fluid in 
Special Relativity we have to introduce corresponding second order tensors. These 
tensors in order to obtain their physical meaning will have to reduce under certain 
limiting conditions to the ones of Newtonian Physics. Of course, as it is the case 
with some vector relativistic physical quantities (e.g. the spin), one should expect 
that there shall be tensor relativistic quantities associated with a relativistic fluid 
which do not have a Newtonian analogue. However these will not be considered in 
what follows. 

In order to define the appropriate geometric object for the description of the 
dynamics of a relativistic fluid with its environment we have to geometrize both the 
effect of the environment and the corresponding response of the relativistic fluid. To 
do that let us recall briefly the case of a Newtonian viscous fluid. 

A Newtonian viscous fluid is a linear elastic isotropic continuum which is described
by the scalar field $\rho(t, \mathbf{r})$ of mass density and the (Eulerian) velocity vector field
$\mathbf{u}(t, \mathbf{r})$. Furthermore these fields satisfy the following two dynamical equations:


1. The continuity equation (22.11) for the mass density (conservation of mass):


$$\frac{D\rho}{Dt} + \rho\,\nabla\cdot\mathbf{u} = \frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{u}) = 0 \tag{23.1}$$

2. The Navier Stokes equation:

$$\frac{Dp^\mu}{Dt} = \rho f^\mu + F^\mu \tag{23.2}$$


where $p^\mu = \rho u^\mu$ is the momentum, $f^\mu$ is the external force density (producing
the inertial motion) and $F^\mu$ is the force due to the interaction of the particles of
the fluid (producing the stress tensor). The force $F^\mu$ is given by the contracted
derivative of the stress tensor $t^{\mu\nu}$ (see equation (22.35)) as follows¹:


$$F^\mu = t^{\mu\nu}{}_{,\nu} \tag{23.5}$$


¹ In particular for a viscous non-conducting fluid the stress tensor is

$$t^{\mu\nu} = \eta\,\tau^{\mu\nu} - p\,\delta^{\mu\nu} \tag{23.3}$$

where $\eta$ is the coefficient of viscosity, $\tau_{\mu\nu} = u_{\mu,\nu} + u_{\nu,\mu} - \frac{2}{3}u^\rho{}_{,\rho}\,\delta_{\mu\nu}$ is the traceless part of the
stress tensor, and $p$ is the isotropic pressure (the trace of the stress tensor). Then

$$F^\mu = \eta\,\tau^{\mu\nu}{}_{,\nu} - p^{,\mu}.$$




Obviously the key equation to be used in the description of the dynamics of
relativistic fluids is the Navier Stokes equation (23.2). This equation contains the
stress tensor via equation (23.5), whereas the rest of the quantities are the
scalar $\rho$ and the vectors $p^\mu$, $f^\mu$. We conclude that, concerning the dynamics of
a relativistic fluid, we have to invent a “stress tensor” which will generalize the
relativistic Newton’s Second Law while reducing to the Navier Stokes equation
in the Newtonian limit.

In order to do that we recall that the stress tensor of the electromagnetic field
considered as a Newtonian fluid (see (22.60)) is contained in the energy momentum
tensor of the electromagnetic field we found in (13.249), where we were studying
the dynamics of the electromagnetic field. Using this result as a guide we decide to
describe the dynamics of a relativistic fluid in terms of the energy momentum tensor,
which we shall (somehow) define. This is the major task we address in the following
sections.²


23.2 Relativistic Fluids in Special Relativity 


In order to associate physical quantities with a relativistic fluid it is necessary that 
we introduce some new concepts which are also used in Newtonian fluids. 

The relativistic physical quantities must be defined in accordance with the
Covariance Principle, that is, they must be expressed in terms of Lorentz tensors.
The first concept, which we have already defined, is that of a comoving observer $u^a$.
Next we consider in the rest space of a comoving observer an elementary volume $dV_0$
and propagate $dV_0$ with $u^a$ so that at all times $dV_0$ contains the same particles
(both in number and identity). This elementary volume we call the comoving volume
of the fluid. Let $n$ be the number of the particles of the fluid in the volume $dV_0$. If
the number $n$ does not change along the integral curve of $u^a$ we say that there do
not exist sinks and sources in the volume $dV_0$.


The dynamics of a Newtonian viscous fluid is given by the Navier Stokes equation (23.2), which in
terms of the momentum density $p^\mu = \rho u^\mu$ is as follows:

$$\frac{Dp^\mu}{Dt} = \rho f^\mu + \eta\,\tau^{\mu\nu}{}_{,\nu} - p^{,\mu}. \tag{23.4}$$

² The concept of energy momentum tensor is present in all branches of Physics, even in Newtonian
Theory; therefore it is useful to derive it using general concepts and covariant methods, which
(with the appropriate adjustments) apply to all physical theories. This means that the concept of
the energy momentum tensor is of a geometric nature which is revealed in each theory by the
decomposition wrt a non-zero characteristic vector field of the theory. The results obtained here
are valid equally well in Newtonian Physics, Special Relativity and General Relativity in the sense
that one obtains the corresponding formulae in each theory by replacing in the general results the
special metric, the dimension and the characteristic decomposing vector field of the theory.




We define the physical quantity particle number density $\tilde{\rho}_0$ of the relativistic
fluid as follows:


$$\tilde{\rho}_0 = \frac{dn}{dV_0} \tag{23.6}$$


where $dn$ is the number of particles in $dV_0$. $\tilde{\rho}_0$ is a scalar (not an invariant!) whose
value is determined in the comoving frame, where both the 3-volume $dV_0$ and the
number $dn$ of particles it contains are counted by $u^a$. Using the physical quantities
$\tilde{\rho}_0$ and $u^a$ we define the particle number current $\tilde{j}^a$ as follows:


$$\tilde{j}^a = \tilde{\rho}_0u^a. \tag{23.7}$$


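Let $\Sigma$ be another observer for which the four-velocity of the comoving volume
is $u^a = \begin{pmatrix}\gamma c\\ \gamma\mathbf{v}\end{pmatrix}_\Sigma$. In order to compute the particle number density $\tilde{\rho}_\Sigma$ in $\Sigma$ we have to
know how both $dn$ and $dV_0$ transform. Concerning the 3-volume $dV_0$ we know that
under a Lorentz transformation the 3-volume transforms as:


$$dV = \frac{dV_0}{\gamma}. \tag{23.8}$$


Concerning $dn$ we assume that in $dV_0$ there do not exist sinks and sources so that
$dn$ does not change. Therefore equation (23.6) gives³:


$$\tilde{\rho}_\Sigma = \tilde{\rho}_0\gamma. \tag{23.9}$$


Similarly the components of the particle number current $\tilde{j}^a$ in $\Sigma$ are:


$$\tilde{j}^a = \tilde{\rho}_0\begin{pmatrix}\gamma c\\ \gamma\mathbf{v}\end{pmatrix}_\Sigma = \begin{pmatrix}\tilde{\rho}_\Sigma c\\ \tilde{\rho}_\Sigma\mathbf{v}\end{pmatrix}_\Sigma. \tag{23.10}$$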

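³ Use that $dn = \tilde{\rho}_\Sigma\,dV = \tilde{\rho}_0\,dV_0$.

We compute the divergence of $\tilde{j}^a$ in order to find if it satisfies a continuity
equation, that is, an equation of the form $\tilde{j}^a{}_{,a} = 0$. We have


$$\tilde{j}^a{}_{,a} = \tilde{j}^0{}_{,0} + \tilde{j}^\mu{}_{,\mu} = \frac{\partial(\tilde{\rho}_\Sigma c)}{\partial(ct)} + \nabla\cdot(\tilde{\rho}_\Sigma\mathbf{v}) = \frac{\partial\tilde{\rho}_\Sigma}{\partial t} + \mathbf{v}\cdot\nabla\tilde{\rho}_\Sigma + \tilde{\rho}_\Sigma\nabla\cdot\mathbf{v} = \frac{D\tilde{\rho}_\Sigma}{Dt} + \tilde{\rho}_\Sigma\nabla\cdot\mathbf{v} \tag{23.11}$$


where $\frac{D}{Dt}$ is the total derivative. It follows that the vanishing of the divergence of
the current $\tilde{j}^a$ is equivalent to the requirement that the number of particles in the
volume $dV_0$ does not change in time. This result is known as the law of conservation
of particle number and it is expressed by the continuity equation⁴:


$$\tilde{j}^a{}_{,a} = 0. \tag{23.12}$$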
The vector $\tilde{j}^a$ characterizes the number of particles in $dV$ but not the type of the
particles. Therefore we have to look for more current vectors defined for the fluid.
In Special Relativity the type of a particle is characterized (to the lowest degree)
by its proper mass. We assume that the fluid consists of one type of particles of
proper mass $m_0$ (rest energy $m_0c^2$) and define the new scalar matter density $\rho_0$ of the
fluid in the comoving frame as follows:


$$\rho_0 = \tilde{\rho}_0m_0c^2. \tag{23.13}$$


If $\rho_\Sigma$ is the energy density of the fluid in the Lorentz frame $\Sigma$ considered above,
then using the transformation of energy:


$$E_\Sigma = \gamma E_0 = \gamma m_0c^2 \tag{23.14}$$

we find:

$$\rho_\Sigma = \rho_0\gamma^2. \tag{23.15}$$
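
The factor $\gamma^2$ in (23.15) comes from two independent factors of $\gamma$: one from the energy of each particle (23.14) and one from the contraction of the volume (23.8). A minimal arithmetic sketch, with sample values for the mass, the particle number, the volume and the speed (all of them assumptions made only for illustration):

```python
import math

c = 299792458.0                      # speed of light in m/s

def lorentz_factor(speed):
    return 1.0 / math.sqrt(1.0 - (speed / c) ** 2)

m0 = 1.0e-27                         # proper mass of one particle in kg (sample value)
dn = 1.0e20                          # number of particles in the comoving volume (sample value)
dV0 = 1.0e-3                         # comoving volume in m^3 (sample value)
speed = 0.8 * c
g = lorentz_factor(speed)

rho0 = dn * m0 * c ** 2 / dV0        # (23.13): proper energy density
E_sigma = g * m0 * c ** 2            # (23.14): energy of one particle in Sigma
dV = dV0 / g                         # (23.8): contracted volume in Sigma
rho_sigma = dn * E_sigma / dV        # energy density measured in Sigma

print(rho_sigma / rho0, g ** 2)
```

The two printed numbers coincide, in agreement with (23.15).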


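Using the matter density we define the matter density 4-current $j^a$ to be the
4-vector:


$$j^a = \rho_0u^a. \tag{23.16}$$


In the frame $\Sigma$ the components of the vector $j^a$ are:


$$j^a = \rho_0\begin{pmatrix}\gamma c\\ \gamma\mathbf{v}\end{pmatrix}_\Sigma = \frac{1}{\gamma}\begin{pmatrix}\rho_\Sigma c\\ \rho_\Sigma\mathbf{v}\end{pmatrix}_\Sigma. \tag{23.17}$$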

⁴ A continuity equation corresponds to the conservation of a scalar quantity which defines a vector.
For example, in the present case the scalar quantity which multiplies the four-velocity in the
corresponding current is the particle number density and the continuity equation is
the conservation of this quantity. Another example from Newtonian Physics is the conservation of
mass, which concerns the divergence of the current $\mathbf{j} = m\mathbf{v}$ and is given by the relation $\frac{dm}{dt} = 0$.


⁵ A more “classical” proof is the following:


$$\rho = \frac{dm}{dV} = \frac{dE}{c^2\,dV} = \frac{\gamma\,dm_0c^2}{c^2\,\frac{1}{\gamma}\,dV_0} = \gamma^2\rho_0.$$


One can prove the same result using (23.9) and (23.14).




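We compute again the divergence of the matter density 4-current. We find:


$$j^a{}_{,a} = \frac{\partial}{\partial t}\left(\frac{\rho_\Sigma}{\gamma}\right) + \nabla\cdot\left(\frac{\rho_\Sigma}{\gamma}\mathbf{v}\right) = \frac{1}{\gamma}\left[\frac{D\rho_\Sigma}{Dt} + \rho_\Sigma\nabla\cdot\mathbf{v}\right] + \rho_\Sigma\frac{D}{Dt}\left(\frac{1}{\gamma}\right). \tag{23.18}$$


Therefore we do not have an equation of continuity for the matter density current.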

This is as far as one can go with the relativistic concepts introduced so far. In the 
following we shall develop the formalism for the study of the kinematics and the 
dynamics of a relativistic fluid. First we need some more mathematical results. 


23.3 The 1+3 Decomposition wrt a General Non-null Vector $q^a$


Consider a (four dimensional) spacetime endowed with a metric $g_{ab}$ and let $q^a$ be a
non-null vector (i.e. $q^2 = \varepsilon(q)\,g_{ab}q^aq^b > 0$, where $\varepsilon(q) = \pm 1$ is the sign of $q^a$) in
this spacetime. Then the tensor:


$$h_{ab}(q) = g_{ab} - \frac{\varepsilon(q)}{q^2}q_aq_b \tag{23.19}$$


projects normal to $q^a$, i.e. $h_{ab}(q)q^b = 0$. Furthermore it is easy to prove that the
tensor $h_{ab}(q)$ satisfies the properties:


$$h_{ab}(q) = h_{ba}(q), \qquad h^a{}_a(q) = 3, \qquad h_{ab}(q)h^{bc}(q) = h_a{}^c(q). \tag{23.20}$$

Using the projection tensor $h_{ab}(q)$ we are able to 1+3 decompose any tensor
along and normal to the vector $q^a$.


Exercise 23.3.1 Let $A^a$ be a vector in the same space as $q^a$ ($q^2 \neq 0$, $A^a$ can be
null).


(a) Show the identity:

$$A^a = \parallel A^a + \perp A^a$$

where $\parallel A^a = \frac{\varepsilon(q)}{q^2}(q_bA^b)q^a$ and $\perp A^a = h^a{}_b(q)A^b$. This (covariant) decomposition
of $A^a$ we call the 1+3 decomposition of $A^a$ wrt $q^a$.
(b) Show that $\perp(\perp A^a) = \perp A^a$ and $\parallel(\parallel A^a) = \parallel A^a$.
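
The statements of the exercise can be checked numerically before proving them. The sketch below assumes the Minkowski metric with signature $(-,+,+,+)$ and sample components for $q^a$ and $A^a$; the helper names `dot` and `projector` are hypothetical. It uses the fact that $\varepsilon(q)/q^2 = 1/(g_{ab}q^aq^b)$, which follows from the definition of $q^2$.

```python
R4 = range(4)
# Minkowski metric, signature (-, +, +, +) (an assumption of this sketch)
eta = [[(-1.0 if a == 0 else 1.0) if a == b else 0.0 for b in R4] for a in R4]

def dot(x, y):
    """g_ab x^a y^b."""
    return sum(eta[a][b] * x[a] * y[b] for a in R4 for b in R4)

def projector(q):
    """Mixed components h^a_b(q) = delta^a_b - q^a q_b / (g_cd q^c q^d), cf. (23.19)."""
    qq = dot(q, q)                      # equals eps(q) q^2, nonzero for non-null q
    q_dn = [sum(eta[a][b] * q[b] for b in R4) for a in R4]
    return [[(1.0 if a == b else 0.0) - q[a] * q_dn[b] / qq for b in R4] for a in R4]

q = [2.0, 0.3, -0.5, 1.0]               # sample timelike vector
A = [1.0, 2.0, 0.5, -1.5]               # sample vector to be decomposed
h = projector(q)

A_par = [dot(q, A) / dot(q, q) * q[a] for a in R4]           # part along q
A_perp = [sum(h[a][b] * A[b] for b in R4) for a in R4]       # part normal to q
A_perp_twice = [sum(h[a][b] * A_perp[b] for b in R4) for a in R4]
trace_h = sum(h[a][a] for a in R4)

print(A_par, A_perp, trace_h)
```

The parallel and normal parts add up to $A^a$, the normal part is orthogonal to $q^a$, projecting twice changes nothing, and the trace of the projector is 3, as stated in (23.20).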




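Exercise 23.3.2 Consider a second rank tensor $A_{ab}$ and by contracting with
$q^aq^b$, $q^ah^b{}_c(q)$, $h^a{}_c(q)q^b$, $h^a{}_c(q)h^b{}_d(q)$ show that the following identity (we
emphasize: mathematical identity!) holds:


$$A_{cd} = \left(\frac{1}{q^4}A_{ab}q^aq^b\right)q_cq_d + \frac{\varepsilon(q)}{q^2}h^a{}_c(q)A_{(ab)}q^bq_d + \frac{\varepsilon(q)}{q^2}h^a{}_d(q)A_{(ab)}q^bq_c + h^a{}_c(q)h^b{}_d(q)A_{(ab)} + A_{[cd]}. \tag{23.21}$$


This identity involves the irreducible parts:


$$\frac{1}{q^4}A_{ab}q^aq^b, \qquad \frac{\varepsilon(q)}{q^2}h^a{}_c(q)A_{(ab)}q^b, \qquad h^a{}_c(q)h^b{}_d(q)A_{(ab)}, \qquad A_{[ab]}$$


and defines the 1+3 decomposition of the tensor $A_{ab}$ along the vector $q^a$. In matrix
form we write symbolically:


$$A_{ab} = \begin{pmatrix}\frac{1}{q^4}A_{ab}q^aq^b & \frac{\varepsilon(q)}{q^2}h^a{}_c(q)A_{(ab)}q^b\\ \frac{\varepsilon(q)}{q^2}h^a{}_c(q)A_{(ab)}q^b & A_{(ab)}h^a{}_c(q)h^b{}_d(q)\end{pmatrix}_\Sigma + A_{[ab]} \tag{23.22}$$


where $\Sigma$ is an arbitrary coordinate frame. This tensor can be decomposed further
if we write the symmetric tensor $h^a{}_c(q)h^b{}_d(q)A_{(ab)}$ in terms of a traceless part and a
trace as follows:


$$h^a{}_c(q)h^b{}_d(q)A_{(ab)} = \frac{1}{3}h^{ab}(q)A_{(ab)}h_{cd}(q) + \left(h^a{}_c(q)h^b{}_d(q) - \frac{1}{3}h^{ab}(q)h_{cd}(q)\right)A_{(ab)}.$$


In this case the 1+3 decomposition of $A_{ab}$ reads:


$$\begin{aligned}A_{cd} ={}& \left(\frac{1}{q^4}A_{ab}q^aq^b\right)q_cq_d + \frac{\varepsilon(q)}{q^2}h^a{}_c(q)A_{(ab)}q^bq_d + \frac{\varepsilon(q)}{q^2}h^a{}_d(q)A_{(ab)}q^bq_c\\ &+ \frac{1}{3}h^{ab}(q)A_{(ab)}h_{cd}(q) + \left(h^a{}_c(q)h^b{}_d(q) - \frac{1}{3}h^{ab}(q)h_{cd}(q)\right)A_{(ab)} + A_{[cd]}\end{aligned} \tag{23.23}$$

and leads to the following symbolic formula in matrix form:


$$A_{ab} = \begin{pmatrix}\frac{1}{q^4}A_{ab}q^aq^b & \frac{\varepsilon(q)}{q^2}h^a{}_c(q)A_{(ab)}q^b\\ \frac{\varepsilon(q)}{q^2}h^a{}_c(q)A_{(ab)}q^b & \frac{1}{3}h^{ab}(q)A_{(ab)}h_{cd}(q) + \left(h^a{}_c(q)h^b{}_d(q) - \frac{1}{3}h^{ab}(q)h_{cd}(q)\right)A_{(ab)}\end{pmatrix}_\Sigma + A_{[ab]}. \tag{23.24}$$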




We conclude that by the 1+3 decomposition, a general second rank tensor $A_{ab}$
is specified by (and specifies) five different tensors:


a. Two scalars: $A_{(ab)}q^aq^b$, $h^{ab}(q)A_{(ab)}$
b. One vector: $\frac{\varepsilon(q)}{q^2}A_{(ab)}q^ah^b{}_c(q)$
c. One traceless symmetric second rank tensor: $\left(h^a{}_c(q)h^b{}_d(q) - \frac{1}{3}h_{cd}(q)h^{ab}(q)\right)A_{(ab)}$
d. One antisymmetric second rank tensor: $A_{[ab]}$.

These tensors are the maximum possible elements that one can employ to
describe the tensor in reference to a non-null vector field $q^a$.


For a symmetric tensor $A_{[ab]} = 0$ and the remaining quantities are the ones we
shall use to describe the dynamics of a relativistic fluid.⁶
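
Since the 1+3 decomposition is a mathematical identity, it can be verified on an arbitrary tensor. The following sketch assumes the Minkowski metric with signature $(-,+,+,+)$, a randomly generated $A_{ab}$ and a sample timelike $q^a$ (all illustrative choices); it computes the irreducible parts, reassembles them and also checks that the traceless part has vanishing trace.

```python
import random
from itertools import product

R4 = range(4)
# Minkowski metric, signature (-, +, +, +) (an assumption of this sketch)
eta = [[(-1.0 if a == 0 else 1.0) if a == b else 0.0 for b in R4] for a in R4]

random.seed(0)
A = [[random.uniform(-1.0, 1.0) for _ in R4] for _ in R4]    # arbitrary A_ab
q_up = [1.5, 0.2, -0.4, 0.7]                                 # sample timelike q^a
q_dn = [sum(eta[a][b] * q_up[b] for b in R4) for a in R4]
qq = sum(q_up[a] * q_dn[a] for a in R4)                      # g_ab q^a q^b = eps(q) q^2

h = [[(1.0 if a == c else 0.0) - q_up[a] * q_dn[c] / qq for c in R4] for a in R4]  # h^a_c
h_dn = [[eta[a][b] - q_dn[a] * q_dn[b] / qq for b in R4] for a in R4]              # h_ab
h_up = [[eta[a][b] - q_up[a] * q_up[b] / qq for b in R4] for a in R4]              # h^ab
S = [[0.5 * (A[a][b] + A[b][a]) for b in R4] for a in R4]    # A_(ab)
F = [[0.5 * (A[a][b] - A[b][a]) for b in R4] for a in R4]    # A_[ab]

# irreducible parts; note eps(q)/q^2 = 1/qq and 1/q^4 = 1/qq^2
scalar = sum(A[a][b] * q_up[a] * q_up[b] for a, b in product(R4, R4)) / qq ** 2
vec = [sum(h[a][c] * S[a][b] * q_up[b] for a, b in product(R4, R4)) / qq for c in R4]
spatial = [[sum(h[a][c] * h[b][d] * S[a][b] for a, b in product(R4, R4))
            for d in R4] for c in R4]

# right-hand side of (23.21)
rebuilt = [[scalar * q_dn[c] * q_dn[d] + vec[c] * q_dn[d] + vec[d] * q_dn[c]
            + spatial[c][d] + F[c][d] for d in R4] for c in R4]
max_err = max(abs(rebuilt[c][d] - A[c][d]) for c, d in product(R4, R4))

# trace and traceless part of the spatial block, cf. (23.23)
trace = sum(h_up[a][b] * S[a][b] for a, b in product(R4, R4))
traceless = [[spatial[c][d] - trace * h_dn[c][d] / 3.0 for d in R4] for c in R4]
residual_trace = sum(h_up[c][d] * traceless[c][d] for c, d in product(R4, R4))

print(max_err, residual_trace)
```

Both printed numbers are zero up to rounding, whatever tensor and non-null vector are chosen.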


23.4 The Extended Metric and the Bivector Metric 

Having given the basics on the 1+3 decomposition of a vector field and a second
rank tensor, we consider the projection tensor $h_{ab}(q)$ and we note the identity:⁷


$$h_{ab}(q) = \frac{\varepsilon(q)}{q^2}(g_{ab}g_{cd} - g_{ac}g_{bd})q^cq^d. \tag{23.25}$$


Identity (23.25) expresses the projection tensor $h_{ab}(q)$ in terms of two tensors: the
tensor


$$g_{abcd} = g_{ab}g_{cd} - g_{ac}g_{bd} \tag{23.26}$$


which depends only on the metric of the 4-space, and the tensor $\frac{q^cq^d}{q^2}$ which
depends solely on the decomposing vector field $q^a$. The new tensor $g_{abcd}$ we call
the extended metric of the space. In terms of the extended metric the projection
tensor $h_{ab}(q)$ is written as follows (use (23.25)):


$$h_{ab}(q) = \frac{\varepsilon(q)}{q^2}g_{abcd}q^cq^d. \tag{23.27}$$


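⁶ We note that the results of this section can be generalized easily to any finite dimension.

⁷ Proof


$$\begin{aligned}h_{ab}(q) &= g_{ab} - \frac{\varepsilon(q)}{q^2}q_aq_b = g_{ab} - \frac{\varepsilon(q)}{q^2}g_{ac}g_{bd}q^cq^d = \frac{\varepsilon(q)}{q^2}\left(\varepsilon(q)q^2g_{ab} - g_{ac}g_{bd}q^cq^d\right)\\ &= \frac{\varepsilon(q)}{q^2}(g_{ab}g_{cd} - g_{ac}g_{bd})q^cq^d = \frac{\varepsilon(q)}{q^2}g_{abcd}q^cq^d.\end{aligned}$$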
q q 




It is an easy exercise to show that the extended metric satisfies the following
symmetry properties⁸:


$$g_{abcd} = g_{a[bc]d} = g_{cdab} = g_{badc}; \qquad g_{a(bc)d} = 0. \tag{23.28}$$
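
The extended metric and its properties are easy to check by direct enumeration of components. The sketch below assumes the Minkowski metric with signature $(-,+,+,+)$ and a sample spacelike $q^a$ (illustrative choices; the function name `g4` is hypothetical). It compares the projection tensor computed from (23.19) with the one computed from (23.27) and tests the symmetries (23.28).

```python
from itertools import product

R4 = range(4)
# Minkowski metric, signature (-, +, +, +) (an assumption of this sketch)
eta = [[(-1.0 if a == 0 else 1.0) if a == b else 0.0 for b in R4] for a in R4]

def g4(a, b, c, d):
    """Extended metric g_abcd = g_ab g_cd - g_ac g_bd, cf. (23.26)."""
    return eta[a][b] * eta[c][d] - eta[a][c] * eta[b][d]

q_up = [0.4, 1.2, -0.3, 0.5]                  # sample spacelike vector
q_dn = [sum(eta[a][b] * q_up[b] for b in R4) for a in R4]
qq = sum(q_up[a] * q_dn[a] for a in R4)       # g_ab q^a q^b = eps(q) q^2

# (23.19) written with eps(q)/q^2 = 1/qq
h_direct = [[eta[a][b] - q_dn[a] * q_dn[b] / qq for b in R4] for a in R4]
# (23.27): projection tensor from the extended metric
h_extended = [[sum(g4(a, b, c, d) * q_up[c] * q_up[d] for c, d in product(R4, R4)) / qq
               for b in R4] for a in R4]
max_diff = max(abs(h_direct[a][b] - h_extended[a][b]) for a, b in product(R4, R4))

# symmetries (23.28): antisymmetry in the middle pair, pair exchange, simultaneous flip
symmetries_ok = all(
    g4(a, b, c, d) == -g4(a, c, b, d)
    and g4(a, b, c, d) == g4(c, d, a, b)
    and g4(a, b, c, d) == g4(b, a, d, c)
    for a, b, c, d in product(R4, repeat=4)
)

print(max_diff, symmetries_ok)
```

The antisymmetry in the middle pair of indices is what makes $g_{a(bc)d}$ vanish.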


Besides the extended metric we may introduce the bivector metric $\bar{g}_{abcd}$, which
is defined as follows:


$$\bar{g}_{abcd} = -\frac{1}{2}\eta_{abrs}\eta_{cd}{}^{rs}. \tag{23.29}$$


The bivector metric has the following symmetries:

$$\bar{g}_{abcd} = \bar{g}_{[ab]cd} = \bar{g}_{ab[cd]} = \bar{g}_{cdab}. \tag{23.30}$$


The bivector metric has the following characteristics:


1. It is expressed in terms of the metric $g_{ab}$ as follows⁹:


$$\bar{g}_{abcd} = g_{ac}g_{bd} - g_{ad}g_{bc} = g_{acdb}. \tag{23.31}$$


2. It satisfies the relation

$$h_{ab}(q) = \frac{\varepsilon(q)}{q^2}\bar{g}_{acbd}q^cq^d$$


that is, the projection tensor can also be expressed in terms of the
bivector metric.

3. The symmetries of $\bar{g}_{abcd}$ are the same as the (index) symmetries of the (Riemann)
curvature tensor. Due to this property, the bivector metric is involved in the 1+3
decomposition of the Riemann tensor.¹⁰
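
The expression of the bivector metric through the metric can be confirmed by brute force. The sketch below assumes Minkowski space with signature $(-,+,+,+)$ in Cartesian coordinates, so that the components of $\eta_{abcd}$ reduce to the permutation symbol (with $\eta_{0123} = +1$, a convention that drops out because the tensor appears twice); the helper names `parity`, `levi` and `bivector` are hypothetical.

```python
from itertools import permutations, product

R4 = range(4)
# Minkowski metric, signature (-, +, +, +); its inverse has the same components
eta = [[(-1.0 if a == 0 else 1.0) if a == b else 0.0 for b in R4] for a in R4]

def parity(perm):
    """Sign of a permutation of (0, 1, 2, 3), found by sorting with transpositions."""
    p = list(perm)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign

eps = {p: parity(p) for p in permutations(R4)}

def levi(a, b, c, d):
    """Components eta_abcd in Cartesian coordinates (zero for repeated indices)."""
    return eps.get((a, b, c, d), 0)

def bivector(a, b, c, d):
    """-1/2 eta_abrs eta_cd^rs, indices raised with the diagonal metric, cf. (23.29)."""
    total = 0.0
    for r, s in product(R4, R4):
        total += levi(a, b, r, s) * levi(c, d, r, s) * eta[r][r] * eta[s][s]
    return -0.5 * total

# (23.31): the same tensor written with the metric
matches_metric_form = all(
    bivector(a, b, c, d) == eta[a][c] * eta[b][d] - eta[a][d] * eta[b][c]
    for a, b, c, d in product(R4, repeat=4)
)
# (23.30): antisymmetry in each pair and symmetry under exchange of the pairs
riemann_like = all(
    bivector(a, b, c, d) == -bivector(b, a, c, d)
    and bivector(a, b, c, d) == -bivector(a, b, d, c)
    and bivector(a, b, c, d) == bivector(c, d, a, b)
    for a, b, c, d in product(R4, repeat=4)
)

print(matches_metric_form, riemann_like)
```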


⁸ According to our standard convention round brackets ( ) denote symmetrization wrt the indices
they enclose and square brackets [ ] anti-symmetrization.

⁹ Proof

$$-\frac{1}{2}\eta_{abkl}\eta_{cd}{}^{kl} = -\frac{1}{2}(-2)\left(\delta^k_a\delta^l_b - \delta^l_a\delta^k_b\right)g_{kc}g_{ld} = g_{ac}g_{bd} - g_{ad}g_{bc} = g_{acdb}.$$


10For example the Riemann tensor Rapcq of a Riemannian connection of a space of constant 
curvature is as follows: 
R =a 
Rabed = — G@—Da—z Sabed when n > 2 


R- 
Rabed = F 8abed When n = 2. 




The question arises: 

Both the extended metric and the bivector metric can be used to express the
projection tensor $h_{ab}(q)$. Which one should we use?

The bivector metric has been used by Dixon¹¹ in the description of relativistic
fluids in Special Relativity. However, as we have remarked, due to its symmetries the
bivector metric is appropriate for spaces of constant non-vanishing curvature, whereas
Minkowski space is flat. Therefore in this book we shall use the extended metric,
which is independent of the curvature of the space; furthermore it is defined quite
naturally in a general curved space, not necessarily of constant curvature.

Having decided to use the extended metric we write for the length of the normal
projection of a vector $A^a$:


$$\perp A^2 = \perp A_a \perp A^a = h_{ab}(q)A^aA^b = \frac{\varepsilon(q)}{q^2}g_{abcd}A^aA^bq^cq^d. \tag{23.32}$$


It follows that the length $\perp A^2$ depends on two parts:


a. The part $g_{abcd}A^aA^b$, which depends on the decomposed vector $A^a$ while it is
independent of the decomposing vector $q^a$
b. The part $\frac{\varepsilon(q)}{q^2}q^cq^d$, which depends only on the vector $q^a$.


This leads us to define a new second rank tensor $E_{ab}$ depending only on the
vector $A^a$ and the space extended metric $g_{abcd}$ by the formula:


$$E_{ab} = Xg_{abcd}A^cA^d \tag{23.33}$$

where $X$ is an invariant (constant) to be determined. The tensor $E_{ab}$ is a symmetric
second rank tensor; therefore it can be 1+3 decomposed wrt the vector field $q^a$. We
expect that the irreducible parts of this tensor will provide the necessary quantities
we need in order to describe the “motion” of the vector $A^a$ in spacetime.


23.5 The 1+3 Decomposition wrt the Four-Velocity $u^a$


We specialize the results of the last section by considering $q^a$ to be the 4-velocity $u^a$
($u^au_a = -c^2$) of some observers. From (23.27) we have for the projection tensor
$h_{ab}(u)$:


$$h_{ab}(u) = -\frac{1}{c^2}g_{abcd}u^cu^d. \tag{23.34}$$


¹¹ See “Special Relativity: The Foundation of Macroscopic Physics” by W.G. Dixon, Cambridge
University Press (1978), Chap. 2.




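For an arbitrary vector $A^a$ this implies the 1+3 irreducible parts:


$$\parallel A^a = -\frac{1}{c^2}(A_bu^b)u^a, \qquad \perp A^a = h^a{}_b(u)A^b, \qquad \perp A^2 = -\frac{1}{c^2}g_{abcd}A^aA^bu^cu^d. \tag{23.35}$$

For a general second order tensor $A_{ab}$ formula (23.23) gives:


$$\begin{aligned}A_{cd} ={}& \left(\frac{1}{c^4}A_{ab}u^au^b\right)u_cu_d - \frac{1}{c^2}h^a{}_c(u)A_{(ab)}u^bu_d - \frac{1}{c^2}h^a{}_d(u)A_{(ab)}u^bu_c\\ &+ \frac{1}{3}h^{ab}(u)A_{(ab)}h_{cd}(u) + \left(h^a{}_c(u)h^b{}_d(u) - \frac{1}{3}h_{cd}(u)h^{ab}(u)\right)A_{(ab)} + A_{[cd]}\end{aligned} \tag{23.36}$$


which in matrix form is:


$$A_{ab} = \begin{pmatrix}\frac{1}{c^4}A_{(ab)}u^au^b & -\frac{1}{c^2}h^a{}_c(u)A_{(ab)}u^b\\ -\frac{1}{c^2}h^a{}_c(u)A_{(ab)}u^b & \frac{1}{3}h^{ab}(u)A_{(ab)}h_{cd}(u) + \left(h^a{}_c(u)h^b{}_d(u) - \frac{1}{3}h_{cd}(u)h^{ab}(u)\right)A_{(ab)}\end{pmatrix}_\Sigma + A_{[ab]}. \tag{23.37}$$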
From this relation (mathematical identity!) we conclude that in Special (and
General) Relativity a 2-index tensor is specified wrt the 4-velocity by (and specifies)
the following five different tensor fields:


a. Two scalars: $\frac{1}{c^4}A_{(ab)}u^au^b$, $h^{ab}(u)A_{(ab)}$
b. One vector: $-\frac{1}{c^2}A_{(ab)}u^ah^b{}_c(u)$
c. One traceless symmetric tensor: $\left(h^a{}_c(u)h^b{}_d(u) - \frac{1}{3}h_{cd}(u)h^{ab}(u)\right)A_{(ab)}$
d. One antisymmetric second rank tensor: $A_{[ab]}$.


Therefore if we consider a second order tensor in the description of the motion
of a physical system, then any class of observers $u^a$ will have at its disposal only the
above tensor fields to study the development of the system in spacetime.

After these mathematical preliminaries we are in a position to study the
kinematics of the relativistic observers $u^a$ and the dynamics of matter as observed
by the observers $u^a$.


23.6 The Kinematics of a Relativistic Fluid 


Given a 4-velocity field $u^a$ in a certain spacetime one computes the second rank
tensor $u_{a;b}$. This tensor is not symmetric and contains information on the 4-velocity
only. Therefore it defines the “kinematics” of the observers $u^a$, that is, the relative
motion¹² between any pair of observers in the congruence of the integral curves of
the vector field $u^a$. The 1+3 decomposition of $u_{a;b}$ wrt $u^a$ is given by the following
identity:


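$$u_{a;b} = -\frac{1}{c^2}\dot{u}_au_b + \omega_{ab} + \sigma_{ab} + \frac{1}{3}\theta h_{ab} \tag{23.38}$$


where we have set¹³:

$$\theta = h^{ab}u_{a;b} = u^a{}_{;a}, \qquad \dot{u}_a = u_{a;b}u^b, \qquad \sigma_{ab} = \left(h^r{}_ah^s{}_b - \frac{1}{3}h_{ab}h^{rs}\right)u_{(r;s)}, \qquad \omega_{ab} = h^r{}_ah^s{}_bu_{[r;s]}.$$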


In order to give a kinematic interpretation to the quantities $\sigma_{ab}$, $\omega_{ab}$, $\dot{u}_a$, $\theta$ we
consider these quantities as describing the corresponding quantities of a Newtonian
fluid. Along this line we consider an inertial frame and then give each quantity the
following kinematic role.


$\theta$: Measures the change of the relative spatial distance between two observers
$\dot{u}_a$: Measures the acceleration of the fluid of observers in the inertial frame
$\sigma_{ab}$: Measures the anisotropy of the relative motion in the spacelike plane normal
to the observer congruence
$\omega_{ab}$: Measures the relative rotation in the spacelike plane normal to the observer
congruence

In accordance with their kinematic role we name $\theta$ the expansion, $\sigma_{ab}$ the shear
(it should be called the strain), $\omega_{ab}$ the rotation and $\dot{u}_a$ the 4-acceleration of the
timelike congruence of the observers $u^a$. The quantities $\sigma_{ab}$, $\omega_{ab}$, $\dot{u}_a$, $\theta$
have a fundamental role in the relativistic kinematics and dynamics of relativistic
fluids (both in Special and in General Relativity).
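
The algebra behind the kinematic decomposition can be tested on a stand-in for the velocity gradient. The sketch below is not a computation of an actual flow: it assumes units with $c = 1$, the Minkowski metric with signature $(-,+,+,+)$, a sample 4-velocity, and a random matrix `B` that only shares with $u_{a;b}$ the property $u^aB_{ab} = 0$. It then forms the expansion, shear, rotation and acceleration as defined above and reassembles them.

```python
import math
import random
from itertools import product

R4 = range(4)
c = 1.0                                   # units with c = 1 (an assumption of this sketch)
eta = [[(-1.0 if a == 0 else 1.0) if a == b else 0.0 for b in R4] for a in R4]

vel = (0.3, -0.2, 0.4)                    # sample 3-velocity, |v| < c
gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in vel) / c ** 2)
u_up = [gamma * c] + [gamma * x for x in vel]
u_dn = [sum(eta[a][b] * u_up[b] for b in R4) for a in R4]

h_mix = [[(1.0 if r == a else 0.0) + u_up[r] * u_dn[a] / c ** 2 for a in R4] for r in R4]  # h^r_a
h_dn = [[eta[a][b] + u_dn[a] * u_dn[b] / c ** 2 for b in R4] for a in R4]
h_up = [[eta[a][b] + u_up[a] * u_up[b] / c ** 2 for b in R4] for a in R4]

random.seed(1)
M = [[random.uniform(-1.0, 1.0) for _ in R4] for _ in R4]
# Stand-in for u_{a;b}: projecting the first index enforces u^a B_ab = 0.
B = [[sum(h_mix[r][a] * M[r][b] for r in R4) for b in R4] for a in R4]

def project(X):
    """h^r_a h^s_b X_rs."""
    return [[sum(h_mix[r][a] * h_mix[s][b] * X[r][s] for r, s in product(R4, R4))
             for b in R4] for a in R4]

sym = [[0.5 * (B[a][b] + B[b][a]) for b in R4] for a in R4]
asym = [[0.5 * (B[a][b] - B[b][a]) for b in R4] for a in R4]

acc = [sum(B[a][b] * u_up[b] for b in R4) for a in R4]             # acceleration
theta = sum(h_up[a][b] * B[a][b] for a, b in product(R4, R4))      # expansion
omega = project(asym)                                              # rotation
proj_sym = project(sym)
sigma = [[proj_sym[a][b] - theta * h_dn[a][b] / 3.0 for b in R4] for a in R4]  # shear

rebuilt = [[-acc[a] * u_dn[b] / c ** 2 + omega[a][b] + sigma[a][b]
            + theta * h_dn[a][b] / 3.0 for b in R4] for a in R4]   # rhs of (23.38)
max_err = max(abs(rebuilt[a][b] - B[a][b]) for a, b in product(R4, R4))
sigma_trace = sum(h_up[a][b] * sigma[a][b] for a, b in product(R4, R4))

print(max_err, sigma_trace)
```

The reassembled tensor agrees with `B` to rounding accuracy, the shear is traceless and the rotation is antisymmetric, which is all that the identity (23.38) asserts at the algebraic level.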


¹² The relative velocity is defined by the Lie transport of the connecting vector
along the elements of the congruence. However we shall not comment further on that.

¹³ Note that $u_{a;b}u^a = 0$ because $u^au_a = -c^2$.


23.7 The Dynamics of a Relativistic Fluid


As we have already decided, the dynamics of a relativistic fluid requires a second
order tensor which we called the energy momentum tensor. This tensor will be built
from the available elements we have, that is, the matter density $\rho_0$, the 4-velocity
$u^a$ and the Lorentz metric $\eta_{ab}$. Before we proceed with the Physics we look at the
tensor $T_{ab}$ from a mathematical point of view and pose the question:


What is the amount of information a symmetric second order tensor can provide
to the comoving observers $u^a$ of the fluid?


Let us do some counting. A general tensor of second rank in a four dimensional
space has $4^2 = 16$ components. But we only need four, as many as the equations
we wish to derive (that is, equations (23.1) and (23.4)). This implies that we have
to restrict $T^{ab}$ by imposing conditions which will restrict the number of free
components to four. The first requirement we impose is that it be
symmetric, that is, $T^{ab} = T^{ba}$. This is reasonable because it must correspond
somehow to the stress tensor of the Newtonian fluid and the latter is symmetric.
This requirement reduces the number of independent components to $\frac{4\cdot 5}{2} = 10$, still
too many. We further demand that the divergence $T^{ab}{}_{;b}$ of this tensor shall be equal
to a definite four-vector (which we must define) and then we end up with the correct
number of equations. Now from equation (22.35) we see that this vector must be the
force which gives rise to the generation of stress in the relativistic fluid. Therefore
the scenario appears to work and we go on.

The energy momentum tensor geometrizes the interaction of the fluid with the
environment. This interaction is described mathematically by the irreducible parts
of the energy momentum tensor resulting from its 1+3 decomposition wrt the
vector field $u^a$. This decomposition is covariant, in the sense that under a coordinate
transformation each irreducible part transforms as an independent tensor. From
the Physics point of view each irreducible part corresponds to a physical quantity
describing the flow of the fluid under the given environment (i.e. the $T_{ab}$).

In the following we shall introduce the energy momentum tensor using two 
approaches. The first approach is more mathematical and involves the extended 
metric and the motion of a single particle. The second approach is less formal and 
uses “reasonable” and “physical” assumptions. 


23.7.1 The Case of a Single Particle 


We consider an isolated particle of mass $m_0$ (rest energy $m_0c^2$) which in a coordinate
system $\Sigma$ has velocity $\mathbf{v}$, acceleration $\mathbf{a}$ and moves under the action of the 4-force $f^a$.
Let us assume that the four-velocity of the observer comoving with the frame $\Sigma$ is $u^a$.
Then the four-velocity $v^a$ of the particle is 1+3 decomposed wrt $u^a$ as follows:


$$v^a = v^0u^a + h^a{}_b(u)v^b = \parallel v^a + \perp v^a \tag{23.39}$$


where $v^0 = -\frac{1}{c^2}(v^au_a)$. In terms of components we have¹⁴


$$v^a = \begin{pmatrix}\gamma c\\ \gamma\mathbf{v}\end{pmatrix}_\Sigma, \qquad u^a = \begin{pmatrix}c\\ \mathbf{0}\end{pmatrix}_\Sigma. \tag{23.40}$$


As we have seen, the length of the space part $\gamma\mathbf{v}$ of the 4-velocity $v^a$ is given by the
formula (see (23.35)):


$$\perp v^2 = (\gamma\mathbf{v})^2 = -\frac{1}{c^2}g_{abcd}v^av^bu^cu^d. \tag{23.41}$$


The kinetic energy of the particle in $\Sigma$ is:


$$T = \frac{1}{2}m_0(\gamma\mathbf{v})^2 = -\frac{m_0}{2c^2}g_{abcd}v^av^bu^cu^d. \tag{23.42}$$

This can be written:

$$2T = E_{ab}u^au^b \tag{23.43}$$


where $E_{ab}$ is the symmetric second rank tensor we introduced in (23.33) with
$X = -\frac{m_0}{c^2}$, that is


$$E_{ab} = -\frac{m_0}{c^2}g_{abcd}v^cv^d. \tag{23.44}$$


We note that the tensor $E_{ab}$ depends only on the metric of the space and the
four-velocity $v^a$ of the particle, while it is independent of the particular frame of
reference $\Sigma$ where the motion of the particle is studied. Therefore it describes the
covariant (that is, independent of the reference frame) part of the motion of the
particle in spacetime.

We 1+3 decompose $E_{ab}$ wrt $u^a$ by computing its contractions with the tensors
$u^au^b$, $u^ah^b{}_c$, $h^a{}_ch^b{}_d$.

For the first contraction we have from (23.43)


$$E_{ab}u^au^b = 2T = m_0\gamma^2v^2$$


where $T$ is the kinetic energy of the particle in the frame of $u^a$.
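
The first contraction can be checked numerically from the definition of the extended metric. The sketch below assumes the Minkowski metric with signature $(-,+,+,+)$ and sample values for $c$, $m_0$ and the 3-velocity (arbitrary units, chosen only so that the powers of $c$ are exercised); the function name `g4` is hypothetical.

```python
import math
from itertools import product

R4 = range(4)
c = 3.0                                   # arbitrary units (sample value)
m0 = 2.0                                  # sample mass
eta = [[(-1.0 if i == 0 else 1.0) if i == j else 0.0 for j in R4] for i in R4]

def g4(i, j, k, l):
    """Extended metric g_ijkl = g_ij g_kl - g_ik g_jl."""
    return eta[i][j] * eta[k][l] - eta[i][k] * eta[j][l]

vel = (0.9, -1.2, 0.6)                    # sample 3-velocity in Sigma, |v| < c
v2 = sum(x * x for x in vel)
gamma = 1.0 / math.sqrt(1.0 - v2 / c ** 2)
v_up = [gamma * c] + [gamma * x for x in vel]   # four-velocity of the particle, cf. (23.40)
u_up = [c, 0.0, 0.0, 0.0]                       # four-velocity of the observer, cf. (23.40)

# E_ij = -(m0 / c^2) g_ijkl v^k v^l, with X = -m0 / c^2 in (23.33)
E = [[-(m0 / c ** 2) * sum(g4(i, j, k, l) * v_up[k] * v_up[l] for k, l in product(R4, R4))
      for j in R4] for i in R4]

two_T = sum(E[i][j] * u_up[i] * u_up[j] for i, j in product(R4, R4))
expected = m0 * gamma ** 2 * v2           # m0 gamma^2 v^2

print(two_T, expected)
```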


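¹⁴ Note that:

$$h_{ab}v^b = \left(g_{ab} + \frac{1}{c^2}u_au_b\right)v^b = v_a + \frac{1}{c^2}(u_bv^b)u_a = v_a + \frac{1}{c^2}(-\gamma c^2)u_a = v_a - \gamma u_a.$$

In a Lorentz frame the coordinates are: $h_{ab}v^b = (-\gamma c, \gamma\mathbf{v}) - \gamma(-c, \mathbf{0}) = (0, \gamma\mathbf{v}) = \perp v_a$. We shall
need this result below.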




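For the second contraction we compute:


$$\begin{aligned}E_{cd}u^ch^d{}_r(u) &= -\frac{m_0}{c^2}g_{cdab}v^av^bu^ch^d{}_r(u)\\ &= -\frac{m_0}{c^2}(g_{cd}g_{ab} - g_{ca}g_{db})v^av^bu^ch^d{}_r(u)\\ &= -\frac{m_0}{c^2}\left[(u_dh^d{}_r(u))(v_av^a) - (u_av^a)h_{rb}(u)v^b\right]\\ &= \frac{m_0}{c^2}(u_av^a)h_{rb}(u)v^b\\ &= \frac{1}{c^2}(v_au^a)h_{rb}(u)p^b\end{aligned} \tag{23.45}$$


where $p^a = m_0v^a$ is the four-momentum of the particle.
In the proper frame $\Sigma$ this is written


$$E_{cd}u^ch^d{}_r(u) = \begin{pmatrix}0\\ -\gamma\mathbf{p}\end{pmatrix}_\Sigma = \begin{pmatrix}0\\ -m_0\gamma^2\mathbf{v}\end{pmatrix}_\Sigma \tag{23.46}$$


that is, this term is the linear momentum of the particle in $\Sigma$.
Concerning the space part $h^c{}_rh^d{}_sE_{cd}$ we compute¹⁵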


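¹⁵ There is another way to write this result. We have:


$$\begin{aligned}E_{cd}h^c{}_rh^d{}_s &= -\frac{m_0}{c^2}\left[g_{ab}v^av^bh_{rs} - h_{ar}v^ah_{bs}v^b\right]\\ &= -\frac{m_0}{c^2}\left[\left(h_{ab} - \frac{1}{c^2}u_au_b\right)v^av^bh_{rs} - h_{ar}v^ah_{bs}v^b\right]\\ &= -\frac{m_0}{c^2}\left[-\frac{1}{c^2}(u_av^a)^2h_{rs} + (h_{ab}h_{rs} - h_{ar}h_{bs})v^av^b\right]\\ &= m_0\gamma^2h_{rs} - \frac{m_0}{c^2}(h_{ab}h_{rs} - h_{ar}h_{bs})v^av^b\\ &= \frac{E^2}{m_0c^4}h_{rs} - \frac{m_0}{c^2}h_{abrs}v^av^b\end{aligned}$$


where $E = m_0\gamma c^2$ is the energy of the particle and:


$$h_{abrs} = h_{ab}h_{rs} - h_{ar}h_{bs} \tag{23.47}$$


is the extended metric in the rest space of the four-velocity $u^a$.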




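$$\begin{aligned}E_{cd}h^c{}_r(u)h^d{}_s(u) &= -\frac{m_0}{c^2}g_{abcd}v^av^bh^c{}_r(u)h^d{}_s(u)\\ &= -\frac{m_0}{c^2}(g_{ab}g_{cd} - g_{ac}g_{bd})v^av^bh^c{}_r(u)h^d{}_s(u)\\ &= -\frac{m_0}{c^2}\left[-c^2h_{rs}(u) - h_{ar}(u)v^ah_{bs}(u)v^b\right]\\ &= m_0\left(h_{rs}(u) + \frac{1}{c^2}h_{ra}(u)v^ah_{sb}(u)v^b\right).\end{aligned} \tag{23.48}$$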


In the proper frame of u@ we have!®: 


0 0 
Ecah{(uyh4(u) = mo} 0OB+45y?vev] . (23.49) 


x 


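There is a more compact and instructive way to write $E_{cd}h^c_r(u)h^d_s(u)$. Indeed from (23.48) we have:

$$\begin{aligned}
E_{cd}h^c_r(u)h^d_s(u) &= m_0\left(h_{rs}(u) + \frac{1}{c^2}\, h_{ra}(u)v^a\, h_{sb}(u)v^b\right) \\
&= m_0\left(g_{cd}\, h^c_r(u)h^d_s(u) + \frac{1}{c^2}\, h^c_r(u)h^d_s(u)\, v_c v_d\right) \\
&= m_0\, h^c_r(u)h^d_s(u)\left(g_{cd} + \frac{1}{c^2}\, v_c v_d\right) \\
&= m_0\, h^c_r(u)h^d_s(u)\, h_{cd}(v)
\end{aligned} \tag{23.50}$$

where $h_{cd}(v) = g_{cd} + \frac{1}{c^2}v_c v_d$ is the projection tensor associated with the four-velocity $v^a$.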
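We break (23.49) into a trace and a trace-free part. For the trace we compute:

$$\begin{aligned}
h^{cd}(u)E_{cd} = h^{rs}(u)\, E_{cd}h^c_r(u)h^d_s(u) &= m_0\left(3 + \frac{1}{c^2}\, h_{ab}(u)v^a v^b\right) \\
&= m_0\left[3 + \frac{1}{c^2}(-c^2) + \frac{1}{c^4}(u_b v^b)^2\right] \\
&= m_0\left[2 + \frac{1}{c^4}\gamma^2 c^4\right] \\
&= m_0(2 + \gamma^2).
\end{aligned} \tag{23.51}$$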
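The contractions above can be checked numerically. The following is a minimal sketch, assuming units with $c = 1$, the Lorentz metric diag(-1, 1, 1, 1), and hypothetical values for the rest mass and the three-velocity (none of these numbers come from the text). It builds $E_{ab}$ from (23.44) and compares $E_{ab}u^a u^b$ and the trace $h^{cd}(u)E_{cd}$ with the closed forms $m_0\gamma^2 v^2$ and $m_0(2+\gamma^2)$.

```python
import math

c = 1.0  # assumption: units with c = 1
ETA = [[-1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]
R = range(4)

def lower(vec):
    """Lower an index with the Lorentz metric diag(-1, 1, 1, 1)."""
    return [sum(ETA[a][b] * vec[b] for b in R) for a in R]

m0 = 2.0                  # hypothetical rest mass
vel = [0.3, -0.4, 0.5]    # hypothetical three-velocity of the particle in Sigma
speed2 = sum(x * x for x in vel)
gamma = 1.0 / math.sqrt(1.0 - speed2 / c**2)

u_up = [c, 0.0, 0.0, 0.0]                      # observer four-velocity u^a
v_up = [gamma * c] + [gamma * x for x in vel]  # particle four-velocity v^a
v_low = lower(v_up)
v_dot_v = sum(v_low[a] * v_up[a] for a in R)   # equals -c^2

# (23.44): E_ab = -(m0/c^2) (g_ab g_cd - g_ac g_bd) v^c v^d
E = [[-(m0 / c**2) * (ETA[a][b] * v_dot_v - v_low[a] * v_low[b])
      for b in R] for a in R]

# First contraction E_ab u^a u^b
E_uu = sum(E[a][b] * u_up[a] * u_up[b] for a in R for b in R)

# Trace with h^{ab}(u) = eta^{ab} + u^a u^b / c^2
h_up_u = [[ETA[a][b] + u_up[a] * u_up[b] / c**2 for b in R] for a in R]
trace = sum(h_up_u[a][b] * E[a][b] for a in R for b in R)

print("E_ab u^a u^b :", E_uu, " closed form:", m0 * gamma**2 * speed2)
print("h^cd(u) E_cd :", trace, " closed form:", m0 * (2.0 + gamma**2))
```

Choosing a velocity with components along all three axes exercises every off-diagonal entry of $E_{ab}$, which a boost along a single axis would not.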


16 The notation $\overset{*}{=}$ indicates the evaluation of the tensorial quantity in a specific coordinate system.




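For the traceless part we find:

$$\begin{aligned}
\left(h^c_r(u)h^d_s(u) - \frac{1}{3}\, h^{cd}(u)h_{rs}(u)\right)E_{cd}
&= m_0\left[h_{rs}(u) + \frac{1}{c^2}\, h_{ra}(u)v^a\, h_{sb}(u)v^b\right] - \frac{1}{3}\, m_0(2 + \gamma^2)\, h_{rs}(u) \\
&= m_0\left[\frac{1}{3}(1 - \gamma^2)\, h_{rs}(u) + \frac{1}{c^2}\, h_{ra}(u)v^a\, h_{sb}(u)v^b\right].
\end{aligned} \tag{23.52}$$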


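We collect the above results in the following formal matrix (note that the (0, 0) element is $\frac{1}{c^2}E_{ab}u^a u^b$):

$$E_{ab} = \begin{pmatrix} \frac{1}{c^2}\, 2T & -\frac{1}{c}\gamma p_\mu \\ -\frac{1}{c}\gamma p_\mu & m_0\left(I_3 + \frac{\gamma^2}{c^2}\, \mathbf{v}\otimes\mathbf{v}\right) \end{pmatrix}_\Sigma \tag{23.53}$$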

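or due to (23.50):

$$E_{ab} = \begin{pmatrix} \frac{1}{c^2}\, 2T & \frac{1}{c^3}(u_c v^c)\, h_{ab}(u) p^b \\ \frac{1}{c^3}(u_c v^c)\, h_{ab}(u) p^b & m_0\, h^c_a(u) h^d_b(u) h_{cd}(v) \end{pmatrix}_\Sigma \tag{23.54}$$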


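In the Newtonian limit $\gamma \to 1$ and in the proper frame of $u^a$ we have $\tau = ct$, where $t$ is the universal Newtonian time. In this frame (23.53) becomes

$$E_{ab} = \begin{pmatrix} \frac{1}{c^2}\, 2T & -\frac{1}{c}\, p_\mu \\ -\frac{1}{c}\, p_\mu & m_0\, h_{\mu\nu}(v) \end{pmatrix}_\Sigma \tag{23.55}$$

that is, $E_{ab}$ is defined solely in terms of the kinematic quantities of the particle in $\Sigma$ and the metric of the Euclidean space. In the rhs we can extract the invariant $m_0$ outside the matrix and write:

$$E_{ab} = m_0\begin{pmatrix} \frac{v^2}{c^2} & -\frac{1}{c}\, v_\mu \\ -\frac{1}{c}\, v_\mu & h_{\mu\nu}(v) \end{pmatrix}_\Sigma. \tag{23.56}$$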


23.7.2 The Case of a Single Particle Under the Action 
of Forces 


We consider now the case of a particle moving under the action of the force $f^i$. In this case the equation of motion of the particle is given by the (generalized) Newton's law:

$$\dot p^i = f^i \tag{23.57}$$




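where $f^i$ is the collective four-force acting on the particle and $\dot p^i = p^i{}_{,j}u^j$.
We calculate the derivative $\dot E_{ab} = E_{ab,c}u^c$ and show that this quantity contains all the information concerning the equation of motion. We have¹⁷:

$$\begin{aligned}
\dot E_{cd} &= -\frac{m_0}{c^2}\, g_{abcd}\, (v^a v^b)^{\cdot} \\
&= -\frac{1}{m_0 c^2}\, g_{abcd}\, (p^a p^b)^{\cdot} = -\frac{1}{m_0 c^2}\,(g_{ab}g_{cd} - g_{ac}g_{bd})\,(p^a p^b)^{\cdot} \\
&= -\frac{1}{m_0 c^2}\,\left(-m_0^2 c^2 g_{cd} - p_c p_d\right)^{\cdot} \\
&= \frac{2}{m_0 c^2}\, \dot p_{(c}p_{d)} = \frac{2}{m_0 c^2}\, f_{(c}p_{d)}.
\end{aligned} \tag{23.58}$$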
We compute the decomposition of the 2-tensor $\dot E_{ab}$ wrt $v^a$. We have:

$$\begin{aligned}
\dot E_{ab}v^a v^b &= \frac{2}{c^2}(f_a v^a)(v_b v^b) = -2(f_a v^a) \\
\dot E_{ab}v^a h^b_c(v) &= \frac{1}{c^2}(f_a v_b + f_b v_a)\, v^a h^b_c(v) = -h^b_c(v) f_b \\
\dot E_{ab}h^a_c(v)h^b_d(v) &= 0.
\end{aligned}$$

From these relations we calculate:

$$f_a v^a = -\frac{1}{2}\dot E_{ab}v^a v^b \tag{23.59}$$

$$h^b_c(v) f_b = -\dot E_{ab}v^a h^b_c(v) \tag{23.60}$$

from which follows that the force in the proper frame of $v^a$ is determined completely by the tensor $\dot E_{ab}$ and the four-velocity $v^a$ as follows

$$f^a = \frac{1}{2c^2}\left(\dot E_{bc}v^b v^c\right)v^a - \dot E_{bc}v^b h^{ca}(v). \tag{23.61}$$

We note that the effects of $u^a$ (i.e. the coordinate system $\Sigma$) are exclusively in the arc length $s$, whereas the four-force can be described completely in terms of the tensor $\dot E_{ab}$ and the four-velocity $v^a$ of the particle.
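As a consistency check of (23.58) to (23.61), the short sketch below builds $\dot E_{ab} = \frac{1}{c^2}(f_a v_b + f_b v_a)$ from a prescribed four-force and then recovers that force from (23.61). The units ($c = 1$) and the numerical values of $v^a$ and $f^a$ are hypothetical choices made only for the illustration.

```python
c = 1.0  # assumption: units with c = 1
ETA = [[-1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]
R = range(4)

def lower(vec):
    """Lower an index with the Lorentz metric diag(-1, 1, 1, 1)."""
    return [sum(ETA[a][b] * vec[b] for b in R) for a in R]

# Hypothetical four-velocity: speed 0.6c along x, so gamma = 1.25
v_up = [1.25 * c, 0.75 * c, 0.0, 0.0]
v_low = lower(v_up)

# Hypothetical four-force (not required to be normal to v^a)
f_up = [0.2, 1.0, -0.5, 0.3]
f_low = lower(f_up)

# Edot_ab = (1/c^2) (f_a v_b + f_b v_a), the form following from (23.58)
Edot = [[(f_low[a] * v_low[b] + f_low[b] * v_low[a]) / c**2
         for b in R] for a in R]

# Projection tensor with raised indices: h^{ab}(v) = eta^{ab} + v^a v^b / c^2
h_up = [[ETA[a][b] + v_up[a] * v_up[b] / c**2 for b in R] for a in R]

# Ingredients of (23.61)
scalar = sum(Edot[b][d] * v_up[b] * v_up[d] for b in R for d in R)
Ev = [sum(Edot[b][d] * v_up[b] for b in R) for d in R]   # Edot_bd v^b

f_rec = [scalar * v_up[a] / (2.0 * c**2)
         - sum(Ev[d] * h_up[d][a] for d in R)
         for a in R]

print("prescribed f^a:", f_up)
print("recovered  f^a:", f_rec)
```

The prescribed force deliberately has a component along $v^a$, so both terms of (23.61) contribute to the reconstruction.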


17 Note that the derivative of the metric vanishes because the metric is flat, therefore in a Cartesian frame the components are constant.


23.8 The Equation of Motion of a Relativistic Fluid 


In Sect. 23.7.1 we have shown that with a particle one can associate the second rank tensor $E_{ab}$, whose components are the kinematic quantities (mass and momentum) of the particle, and that the tensor $E_{ab}$ determines completely the motion of the particle. In this section we generalize this concept to the case of a fluid and show that the equation of motion is obtained from the divergence of the density of the tensor $E_{ab}$. To do that we consider an elementary comoving volume $dV_0$ in a relativistic fluid consisting of one type of particles and assume that the matter density of the fluid in the proper frame of the comoving volume $dV_0$ is $\rho_0$. Let $u^a$ be the four-velocity of the comoving observers in $dV_0$ and assume that in their proper frame, $\Sigma$ say, the velocity of the fluid is $\mathbf{v}$. This implies that the matter density of the fluid in the frame $\Sigma$ is $\rho = \rho_0\gamma^2$. The density¹⁸ of the tensor $E_{ab}$ is given by

$$E_{ab} = -\frac{\rho_0}{c^2}\, g_{abcd}\, v^c v^d.$$
Cc 


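We compute the divergence $E^{ab}{}_{,b}$ in the frame $\Sigma$ and compare the result with the equations of motion of the fluid, that is, equations (23.1) and (23.2). We have for the zero component¹⁹:

$$\begin{aligned}
E^{0b}{}_{,b} \overset{*}{=} E^{00}{}_{,0} + E^{0\mu}{}_{,\mu} &= (\rho_0\gamma^2\beta^2)_{,0} + \frac{1}{c}(\rho_0\gamma^2 v^\mu)_{,\mu} \\
&= (\rho_0(\gamma^2 - 1))_{,0} + \frac{1}{c}(\rho v^\mu)_{,\mu} \\
&= (\rho_0\gamma^2 - \rho_0)_{,0} + \frac{1}{c}(\rho v^\mu)_{,\mu} \\
&= \rho_{,0} - \rho_{0,0} + \frac{1}{c}(\rho v^\mu)_{,\mu} \\
&= \frac{D\rho}{d\tau} - \rho_{0,0} + \frac{1}{c}\,\rho\, v^\mu{}_{,\mu}
\end{aligned} \tag{23.62}$$

where $\tau = ct$ is the proper time of the comoving observer and $\frac{D}{d\tau} = \partial_0 + \frac{1}{c}v^\mu\partial_\mu$.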


18 We use the same symbol for both the tensor $E_{ab}$ and its density. The reader should note this to avoid confusion.

19 Recall that when we raise the 0 index we change sign! Also note that the derivative ${}_{,0}$ is wrt $ct$.




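Concerning the space part we have:

$$\begin{aligned}
E^{\mu b}{}_{,b} \overset{*}{=} E^{\mu 0}{}_{,0} + E^{\mu\nu}{}_{,\nu} &= \frac{1}{c}(\rho_0\gamma^2 v^\mu)_{,0} + \left(\rho_0\delta^{\mu\nu} + \frac{1}{c^2}\rho v^\mu v^\nu\right)_{,\nu} \\
&= \frac{1}{c}(\rho v^\mu)_{,0} + \frac{1}{c^2}(\rho v^\mu v^\nu)_{,\nu} + \rho_{0,\nu}\delta^{\mu\nu} \\
&= \frac{1}{c}\frac{D(\rho v^\mu)}{d\tau} + \rho_{0,\nu}\delta^{\mu\nu} + \frac{1}{c^2}\,\rho v^\mu\, v^\nu{}_{,\nu} \\
&= \frac{1}{c}\frac{D(p^\mu)}{d\tau} + \rho_{0,\nu}\delta^{\mu\nu} + \frac{1}{c^2}\, p^\mu\, v^\nu{}_{,\nu}
\end{aligned} \tag{23.63}$$

where $p^\mu = \rho v^\mu$ is the momentum density of the fluid.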
Therefore finally we have (for $v^\mu{}_{,\mu} = 0$, i.e. incompressible flow):

$$E^{ab}{}_{,b} \overset{*}{=} \begin{pmatrix} \frac{D\rho}{d\tau} - \rho_{0,0} \\ \frac{1}{c}\frac{Dp^\mu}{d\tau} + \rho_0{}^{,\mu} \end{pmatrix}_\Sigma = \frac{1}{c}\begin{pmatrix} \frac{D(E/c)}{d\tau} \\ \frac{Dp^\mu}{d\tau} \end{pmatrix}_\Sigma + \begin{pmatrix} -\rho_{0,0} \\ \rho_0{}^{,\mu} \end{pmatrix}_\Sigma = \frac{1}{c}\frac{Dp^a}{d\tau} + \rho_0{}^{,a} \quad (\text{where } \tau = ct). \tag{23.64}$$

Here $E = \rho c^2$ and $p^a = (E/c, p^\mu)$. We see that the quantity $E^{ab}{}_{,b}$ gives the lhs of the generalized Newton equation. Therefore we can write this law as follows

$$E^{ab}{}_{,b} = \frac{1}{c}\left(f^a + F^a\right) + \rho_0{}^{,a} \tag{23.65}$$

where $f^a$ contains all the external inertial forces which act on the whole volume $dV_0$, $F^a$ contains all internal forces due to the strain motion of the particles of the fluid in the interior of the volume element $dV_0$, and $\rho_0{}^{,a}$ is the change in the momentum due to the change of the mass in the volume $dV_0$. The internal forces are the stress in the fluid. If we denote the stress tensor of the fluid as $\tau_{ab}$ then $F^a = \tau^{ab}{}_{,b}$ (see equation (22.35)), as is the case in Newtonian fluids. Then we define the energy momentum tensor of the fluid as follows


FP? 27 a? (23.66) 


and the equation of motion of the volume element d Vo is: 


1 
= - f" — pp. (23.67) 




This is the Navier-Stokes equation generalized to a relativistic fluid in which there may be sinks or sources of matter.

The above scenario is fully covariant and applies directly to General Relativ- 
ity provided one replaces the partial derivative with the covariant (Riemannian) 
derivative. 


23.8.1 The Dynamical Physical Quantities of a Relativistic 
Fluid 


The energy momentum tensor $T_{ab}$ contains all the physical information we need in order to describe the motion of a relativistic fluid in a given external environment. Furthermore it is independent of any particular observers $u^a$ and geometrizes both the internal forces (stress) of the fluid as well as the environment which modulates the motion of the relativistic fluid. Therefore it is the appropriate geometric quantity which one should use in order to define the physical quantities of the relativistic fluid. In order to do that we 1+3 decompose $T_{ab}$ wrt the observers²⁰ $u^a$ using the general formula (23.36). We find the following decomposition (mathematical identity!):

$$T_{ab} = \mu u_a u_b + p h_{ab} + 2q_{(a}u_{b)} + \pi_{ab} \tag{23.68}$$

where we have introduced the tensors/irreducible parts (here $u^a u_a = -1$):

$$\mu = T_{ab}u^a u^b \tag{23.69}$$

$$p = \frac{1}{3}h^{ab}T_{ab} \tag{23.70}$$

$$q_a = -h_a{}^b T_{bc}u^c \tag{23.71}$$

$$\pi_{ab} = \left(h_a{}^r h_b{}^s - \frac{1}{3}h_{ab}h^{rs}\right)T_{rs}. \tag{23.72}$$

As expected the 1+3 decomposition of $T_{ab}$ produces two scalar fields ($\mu$, $p$), one spacelike vector ($q^a$, $q_a u^a = 0$) and one traceless symmetric 2-tensor ($\pi_{ab}$, $\pi^a{}_a = 0$).
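The decomposition (23.68) to (23.72) is an algebraic identity, so it can be tested on any symmetric tensor. The sketch below is one possible check, assuming the normalization $u^a u_a = -1$ used above, the sign convention $q_a = -h_a{}^b T_{bc}u^c$, and a made-up symmetric matrix for $T_{ab}$; the function name `decompose` is hypothetical.

```python
ETA = [[-1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]
R = range(4)

def decompose(T, u_up):
    """1+3 decomposition of a symmetric T_ab wrt u^a (with u^a u_a = -1)."""
    u_low = [sum(ETA[a][b] * u_up[b] for b in R) for a in R]
    h_low = [[ETA[a][b] + u_low[a] * u_low[b] for b in R] for a in R]
    h_up = [[ETA[a][b] + u_up[a] * u_up[b] for b in R] for a in R]
    # mixed projector h_a^b = delta_a^b + u_a u^b
    h_mix = [[(1.0 if a == b else 0.0) + u_low[a] * u_up[b] for b in R]
             for a in R]
    mu = sum(T[a][b] * u_up[a] * u_up[b] for a in R for b in R)
    p = sum(h_up[a][b] * T[a][b] for a in R for b in R) / 3.0
    q = [-sum(h_mix[a][b] * T[b][d] * u_up[d] for b in R for d in R)
         for a in R]
    pi = [[sum(h_mix[a][r] * h_mix[b][s] * T[r][s] for r in R for s in R)
           - p * h_low[a][b] for b in R] for a in R]
    return mu, p, q, pi, u_low, h_low

# Hypothetical symmetric tensor T_ab and observers moving with 0.6c along x
T = [[5.0, 1.0, 0.5, -0.2],
     [1.0, 2.0, 0.3, 0.1],
     [0.5, 0.3, 1.5, 0.4],
     [-0.2, 0.1, 0.4, 1.0]]
u_up = [1.25, 0.75, 0.0, 0.0]

mu, p, q, pi, u_low, h_low = decompose(T, u_up)

# Reassemble the right-hand side of (23.68)
T_rebuilt = [[mu * u_low[a] * u_low[b] + p * h_low[a][b]
              + q[a] * u_low[b] + q[b] * u_low[a] + pi[a][b]
              for b in R] for a in R]

max_err = max(abs(T_rebuilt[a][b] - T[a][b]) for a in R for b in R)
q_dot_u = sum(q[a] * u_up[a] for a in R)
pi_trace = sum(ETA[a][b] * pi[a][b] for a in R for b in R)
print("mu =", mu, " p =", p)
print("max reconstruction error:", max_err)
print("q.u =", q_dot_u, " trace(pi) =", pi_trace)
```

Running the same function with a different `u_up` on the same `T` gives different values of $\mu$, $p$, $q_a$, $\pi_{ab}$, which illustrates that these quantities are observer dependent.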

The tensors $\mu$, $p$, $q_a$, $\pi_{ab}$ define the physical quantities of a relativistic fluid. We assume that they have the following physical meaning:

The scalar $\mu$ corresponds to mass density,

The scalar $p$ corresponds to isotropic pressure,


20 We note that in this approach the observers are not considered to be part of the physical system.
This distinction between observers and observed systems reflects the sharp distinction between 
observer and observation which lies at the roots of non-quantum Physics. 




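Table 23.1 Types of energy-momentum tensors

| $\mu$ | $p$ | $q^a$ | $\pi_{ab}$ | $T_{ab}$ | Type of fluid |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | Empty space |
| $\neq 0$ | 0 | 0 | 0 | $T_{ab} = \mu u_a u_b$ | Dust |
| $\neq 0$ | $\neq 0$ | 0 | 0 | $T_{ab} = \mu u_a u_b + p h_{ab}$ | Perfect fluid |
| $\neq 0$ | $\neq 0$ | $\neq 0$ | 0 | $T_{ab} = \mu u_a u_b + p h_{ab} + 2q_{(a}u_{b)}$ | Isotropic heat conducting non-perfect fluid |
| $\neq 0$ | $\neq 0$ | 0 | $\neq 0$ | $T_{ab} = \mu u_a u_b + p h_{ab} + \pi_{ab}$ | Anisotropic fluid without heat flux |
| $\neq 0$ | $\neq 0$ | $\neq 0$ | $\neq 0$ | $T_{ab} = \mu u_a u_b + p h_{ab} + 2q_{(a}u_{b)} + \pi_{ab}$ | General anisotropic fluid |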


The vector $q^a$ corresponds to the heat flux vector or momentum transfer vector,

The traceless symmetric tensor $\pi_{ab}$ corresponds to the anisotropic stress tensor.

It is to be noted that the values of these quantities refer to the observers $u^a$. For another class of observers these quantities will be different although the fluid will be the same! This observation is crucial in understanding the dynamical description of relativistic fluids.

According to the above analysis we classify the relativistic fluids in terms of their 
dynamical physical variables as shown in Table 23.1. 


23.9 The Dynamical Equations of Motion of a Relativistic 
Fluid: A Simplified Approach 


As we have remarked, the dynamics of a relativistic fluid requires the energy momentum tensor, which is a second rank tensor $T^{ab}$. The tensor $T^{ab}$ will be built from the available elements we have, that is, the number density $\rho_0$, the 4-velocity $u^a$ and the Lorentz metric $\eta_{ab}$. Due to the results on the Newtonian fluids we require this tensor to be symmetric and the equations of motion of the fluid to be of the form $T^{ab}{}_{,b} = f^a$, where $f^a$ is the external "force" density, which is not part of the fluid and refers to the environment.

In order to find such a tensor $T^{ab}$ we note that the possible choices we have, using the available tensors, are the following:

$$T_{ab} = \rho_0\eta_{ab}, \qquad T_{ab} = \rho_0 u_a u_b, \qquad T_{ab} = \rho_0 u_a u_b + A\eta_{ab}.$$

The first choice is rejected because it does not contain the 4-velocity $u^a$ of the fluid. The second choice is the simplest and it is contained in the last. We choose the second because the last contains the unknown quantity $A$. If necessary we shall return to it.




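We consider $T_{ab} = \rho_0 u_a u_b$ and compute the components $T^{ab}$ in a general Lorentz frame $\Sigma$. We have:

$$T^{ab} = \rho_0\begin{pmatrix} \gamma c \\ \gamma\mathbf{v} \end{pmatrix}_\Sigma \otimes \begin{pmatrix} \gamma c \\ \gamma\mathbf{v} \end{pmatrix}_\Sigma$$

where $\otimes$ denotes tensor product. It follows:

$$T^{ab} = \rho_\Sigma\begin{pmatrix} c^2 & c v^\nu \\ c v^\mu & v^\mu v^\nu \end{pmatrix}_\Sigma \tag{23.73}$$

where²¹ $\rho_\Sigma = \rho_0\gamma^2$. We note immediately that the unwanted quantities $\gamma$ have been absorbed into the matter density $\rho_\Sigma$, which is one of the quantities we want to use in our equations.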
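The block form (23.73) can be verified directly from the outer product. The following sketch uses hypothetical values ($c = 2$ in arbitrary units, so that factors of $c$ are visible, and a made-up density and velocity) and compares $\rho_0 u^a u^b$ entry by entry with $\rho_\Sigma$ times the matrix of (23.73).

```python
import math

c = 2.0                  # assumption: arbitrary units with c = 2
rho0 = 3.0               # hypothetical proper density
vel = [0.6, 0.8, 0.0]    # hypothetical fluid three-velocity in Sigma (|v| < c)

speed2 = sum(x * x for x in vel)
gamma = 1.0 / math.sqrt(1.0 - speed2 / c**2)

# Four-velocity u^a = (gamma c, gamma v) in Sigma
u_up = [gamma * c] + [gamma * x for x in vel]

# T^{ab} = rho0 u^a u^b
T = [[rho0 * u_up[a] * u_up[b] for b in range(4)] for a in range(4)]

# Block form (23.73) with rho_Sigma = rho0 gamma^2
rho_sigma = rho0 * gamma**2
block = [[0.0] * 4 for _ in range(4)]
block[0][0] = rho_sigma * c**2
for i in range(3):
    block[0][i + 1] = rho_sigma * c * vel[i]
    block[i + 1][0] = rho_sigma * c * vel[i]
    for j in range(3):
        block[i + 1][j + 1] = rho_sigma * vel[i] * vel[j]

max_diff = max(abs(T[a][b] - block[a][b]) for a in range(4) for b in range(4))
print("rho_Sigma =", rho_sigma)
print("largest difference between the two forms:", max_diff)
```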

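We compute next the divergence $T^{ab}{}_{,b}$ and find the equations resulting if we set it equal to $f^a$. We write for the coordinates in $\Sigma$, $x^0 = ct$ and $x^\mu = \{x, y, z\}$, and find:

$$T^{0a}{}_{,a} = T^{00}{}_{,0} + T^{0\mu}{}_{,\mu} = (\rho_\Sigma c^2)_{,0} + (\rho_\Sigma c v^\mu)_{,\mu} = c\left(\frac{\partial\rho_\Sigma}{\partial t} + \nabla\cdot(\rho_\Sigma\mathbf{v})\right).$$

If we demand $T^{0a}{}_{,a} = 0$ (i.e. we assume that there are no sinks or sources) then it follows that in $\Sigma$ this condition gives:

$$\frac{\partial\rho_\Sigma}{\partial t} + \nabla\cdot(\rho_\Sigma\mathbf{v}) = 0 \tag{23.74}$$

which is the continuity equation for the mass density. Obviously equation (23.74) is not covariant because $T^{0a}{}_{,a}$ is a component of the four vector $T^{ab}{}_{,b}$. However the form of the equation has been derived for an arbitrary Lorentz frame $\Sigma$, and the difference from another frame $\Sigma'$ will only be in the particular values of the non-relativistic quantities $\rho_\Sigma$ and $\mathbf{v}$, not in its form.

We continue with the spatial components of the vector $T^{ab}{}_{,b}$. We compute in $\Sigma$:

$$\begin{aligned}
T^{\mu a}{}_{,a} = T^{\mu 0}{}_{,0} + T^{\mu\nu}{}_{,\nu} &= (c\rho_\Sigma v^\mu)_{,0} + (\rho_\Sigma v^\mu v^\nu)_{,\nu} \\
&= c\rho_{\Sigma,0}v^\mu + c\rho_\Sigma v^\mu{}_{,0} + \rho_{\Sigma,\nu}v^\nu v^\mu + \rho_\Sigma (v^\mu v^\nu)_{,\nu} \\
&= \left(c\rho_{\Sigma,0} + \rho_{\Sigma,\nu}v^\nu + \rho_\Sigma v^\nu{}_{,\nu}\right)v^\mu + \rho_\Sigma\left(c v^\mu{}_{,0} + v^\nu v^\mu{}_{,\nu}\right) \\
&= \left(c\rho_{\Sigma,0} + (\rho_\Sigma v^\nu)_{,\nu}\right)v^\mu + \rho_\Sigma\left(c v^\mu{}_{,0} + v^\nu v^\mu{}_{,\nu}\right).
\end{aligned}$$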


21 Recall that according to our conventions lower indices denote rows and upper indices denote columns.




It follows that the first part in the rhs is the continuity equation for the mass density in $\Sigma$ and the second part is the lhs of Euler's equation of motion in $\Sigma$. Therefore if we assume that the continuity equation is satisfied we find that Euler's equation of motion is satisfied provided the space part of the vector $T^{ab}{}_{,b}$ satisfies the condition:

$$T^{\mu a}{}_{,a} = -p^{,\mu} + f^\mu \tag{23.75}$$

where $p$ is the isotropic pressure of the fluid, and $f^\mu$ the external force density. We conclude that:

a. The choice of $T^{ab} = \rho_0 u^a u^b$ defines at most perfect fluids.

b. The time part $T^{0a}{}_{,a} = 0$ of the vector $T^{ab}{}_{,b}$ corresponds to the continuity equation (for matter).

c. The space part $T^{\mu a}{}_{,a}$ of the vector $T^{ab}{}_{,b}$ must equal $-p^{,\mu} + f^\mu$ in order that Euler's equation of motion is satisfied.

We note that if the pressure $p$ is a constant then $T^{ab} = \rho_0 u^a u^b$ satisfies both equations. A perfect fluid for which the pressure $p$ is zero is described by the tensor $T^{ab} = \rho_0 u^a u^b$ and we call it dust. Dust is the simplest perfect fluid and it is described only in terms of the matter density and the four-velocity.

The isotropic pressure is an internal element of the Newtonian fluid, therefore it must be included in the relativistic description of the fluid. This means that we are looking for a generalization of $T^{ab}$ such that Euler's equation of motion will follow from the condition $T^{\mu a}{}_{,a} = f^\mu$. We consider:

$$T_{ab} = \rho_0 u_a u_b + s_{ab} \tag{23.76}$$

where $s_{ab}$ is a symmetric tensor such that:

a. $s_{ab}u^b = 0$ (in order to leave the $T^{0a}{}_{,a}$ component the same, because we have shown that it works).

b. $s^{\mu a}{}_{,a} = p^{,\mu}$ (in order to have that $T^{\mu a}{}_{,a} = f^\mu$).

c. The tensor $s_{ab}$ will be constructed from the available elements: $p$, $\eta_{ab}$, $u_a$.

The most general tensor which can be constructed from the elements $\eta_{ab}$, $u_a$ is:

$$s_{ab} = A u_a u_b + B\eta_{ab}.$$

Condition $s_{ab}u^b = 0$ implies:

$$-Ac^2 u_a + B u_a = 0 \Rightarrow B = Ac^2$$

so that $s_{ab}$ becomes:

$$s_{ab} = Ac^2\left(\eta_{ab} + \frac{1}{c^2}u_a u_b\right) = Ac^2 h_{ab} \tag{23.77}$$




where:

$$h_{ab} = \eta_{ab} + \frac{1}{c^2}u_a u_b \tag{23.78}$$

is the tensor which projects normal to the vector $u^a$, that is $h_{ab}u^b = 0$.
We examine next condition $s^{\mu a}{}_{,a} = p^{,\mu}$. We consider the comoving frame in which the four-velocity is $u^a = (c, \mathbf{0})$, so that $u_a = (-c, \mathbf{0})$, and using that in a Lorentz frame $\eta_{ab} = \mathrm{diag}(-1, 1, 1, 1)$ we find:

$$h_{ab} = \mathrm{diag}(-1, 1, 1, 1) + \frac{1}{c^2}\begin{pmatrix} -c \\ \mathbf{0} \end{pmatrix} \otimes \begin{pmatrix} -c \\ \mathbf{0} \end{pmatrix} = \mathrm{diag}(0, 1, 1, 1). \tag{23.79}$$

It follows that in that frame:

$$s_{ab} = Ac^2\, \mathrm{diag}(0, 1, 1, 1) \Rightarrow s^{\mu a}{}_{,a} = c^2 A^{,\mu}$$

and condition $s^{\mu a}{}_{,a} = p^{,\mu}$ implies $Ac^2 = p$ (up to a constant which we take to be zero, the case of dust). Therefore we have:

$$s_{ab} = p h_{ab}$$

and we conclude that the tensor $T_{ab}$ we are looking for is:

$$T_{ab} = \rho_0 u_a u_b + p h_{ab}. \tag{23.80}$$

$T_{ab}$ has the property that its divergence $T^{ab}{}_{,b}$ produces both the continuity equation and Euler's equation of motion for a perfect fluid provided we set $T^{ab}{}_{,b} = f^a$. An equivalent form of the tensor $T_{ab}$ is:

$$T_{ab} = \left(\rho_0 + \frac{1}{c^2}p\right)u_a u_b + p\eta_{ab}. \tag{23.81}$$

The tensor $T_{ab}$ we call the energy-momentum tensor of a (relativistic) perfect fluid. It contains all the dynamics of the fluid.
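The algebra leading to (23.79) to (23.81) can be confirmed with a few lines of arithmetic. The sketch below assumes $c = 2$ in arbitrary units and hypothetical values for $\rho_0$ and $p$. It checks that $h_{ab}$ reduces to diag(0, 1, 1, 1) in the comoving frame, that $h_{ab}u^b = 0$ for a boosted four-velocity, and that the two forms (23.80) and (23.81) agree.

```python
c = 2.0      # assumption: arbitrary units with c = 2
rho0 = 3.0   # hypothetical proper density
p = 0.7      # hypothetical isotropic pressure
ETA = [[-1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]
R = range(4)

def lower(vec):
    """Lower an index with the Lorentz metric diag(-1, 1, 1, 1)."""
    return [sum(ETA[a][b] * vec[b] for b in R) for a in R]

def projector(u_up):
    """h_ab = eta_ab + u_a u_b / c^2, as in (23.78)."""
    u_low = lower(u_up)
    return [[ETA[a][b] + u_low[a] * u_low[b] / c**2 for b in R] for a in R]

# Comoving frame: u^a = (c, 0, 0, 0)
h_comoving = projector([c, 0.0, 0.0, 0.0])

# Boosted frame: speed 0.6c along x, gamma = 1.25
u_up = [1.25 * c, 1.25 * 0.6 * c, 0.0, 0.0]
u_low = lower(u_up)
h = projector(u_up)
h_dot_u = [sum(h[a][b] * u_up[b] for b in R) for a in R]

# (23.80) and (23.81)
T_80 = [[rho0 * u_low[a] * u_low[b] + p * h[a][b] for b in R] for a in R]
T_81 = [[(rho0 + p / c**2) * u_low[a] * u_low[b] + p * ETA[a][b]
         for b in R] for a in R]
max_diff = max(abs(T_80[a][b] - T_81[a][b]) for a in R for b in R)

print("h_ab in the comoving frame:", h_comoving)
print("h_ab u^b in the boosted frame:", h_dot_u)
print("largest difference between (23.80) and (23.81):", max_diff)
```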


23.9.1 The Energy Momentum Tensor of a Relativistic Viscous 
Fluid 


The energy momentum tensor of a viscous fluid is a tensor $T^{ab}$ such that the vanishing of its divergence $T^{ab}{}_{,b}$ will produce both:

a. The continuity equation for the energy density, $T^{0a}{}_{,a} = 0$




b. The Navier-Stokes equation of motion (see (23.4)):

$$\rho_0\left(\frac{\partial v^\mu}{\partial t} + v^\mu{}_{,\nu}v^\nu\right) = \rho_0 f^\mu + \tau^{\mu\nu}{}_{,\nu} - p^{,\mu}. \tag{23.82}$$

This equation is written

$$\rho_0\frac{Dv^\mu}{dt} + p^{,\mu} = \rho_0 f^\mu + \tau^{\mu\nu}{}_{,\nu}.$$

The lhs is the Euler equation of motion for a perfect fluid, therefore it will be produced from $T_{ab} = \rho_0 u_a u_b + p h_{ab}$. There remains only the term $\tau^{\mu\nu}{}_{,\nu}$ to be considered. We define:

$$T_{ab} = \rho_0 u_a u_b + p h_{ab} + \phi_{ab} \tag{23.83}$$

where $\phi_{ab}$ is the symmetric stress tensor corresponding to the strain motions in the relativistic fluid. In order to compute the form of the tensor $\phi_{ab}$ we note that $\phi_{ab}$ is symmetric, traceless and it is restricted by the condition:

$$\phi_{ab}u^a u^b = 0.$$

Using the 1+3 decomposition wrt the 4-velocity vector $u^a$ we find that the most general form of the tensor $\phi_{ab}$ is:

$$\phi_{ab} = 2q_{(a}u_{b)} + \pi_{ab} \tag{23.84}$$

where the vector $q_a$ is obtained by projecting $\phi_{bc}u^c$ with $h_a{}^b$ as in (23.71) ($\Leftrightarrow q_a u^a = 0$) and the traceless tensor $\pi_{ab}$ is such that $\pi_{ab} = h_a{}^c h_b{}^d\pi_{cd}$ or, equivalently, $\pi_{ab}u^a = \pi_{ab}u^b = 0$. We conclude that the required form of the energy momentum tensor for a viscous fluid is:

$$T_{ab} = \rho_0 u_a u_b + p h_{ab} + 2q_{(a}u_{b)} + \pi_{ab} \tag{23.85}$$

which coincides with (23.68).


23.10 The Electromagnetic Field in Vacuum as a Relativistic 
Fluid 


In Sect. 22.6 we considered the electromagnetic field in empty space as a viscous Newtonian fluid and determined the stress tensor in terms of the electric and the magnetic field $E^\mu$, $B^\mu$ using Maxwell equations. In this section we will consider the electromagnetic field in empty space as a relativistic fluid and determine the energy momentum tensor in terms of the four-electric and the four-magnetic fields $E^a$, $B^a$.




The results of this section are the same in General Relativity provided the partial 
derivative is replaced by the covariant derivative. 

In Sect. 13.15 we considered the energy momentum tensor of the electromagnetic field propagating in a homogeneous and isotropic material to be given by the expression (13.236). We note that the energy momentum tensor is not symmetric unless the "material" is the empty space, where it is symmetric.

Based on this and the results of the previous section we require the energy 
momentum tensor of the electromagnetic field in empty space to satisfy the 
following requirements: 


a. ${}_{EM}T^{ab}$ must be of the form (23.83).

b. It must satisfy the requirement ${}_{EM}T^{ab}{}_{,b} = f^a$, where $f^a$ is the Lorentz force.

c. The EM field tensor $F_{ab}$ satisfies Maxwell equations (13.57) and (13.58) (in SI units):

$$F^{ab}{}_{,b} = -\mu_0 j^a \tag{23.86}$$

$$3F_{[ab,c]} = F_{ab,c} + F_{bc,a} + F_{ca,b} = 0 \tag{23.87}$$

where $j^a = q u^a$ is the four-current. The requirement of charge conservation implies the continuity equation (provided there are no sinks or sources):

$$j^a{}_{,a} = 0. \tag{23.88}$$

The Lorentz force acting on the four-current is given by the formula (see (13.65)):

$$f^a = F^{ab}j_b. \tag{23.89}$$

Using Maxwell equation (23.86) the Lorentz force can be written as follows:

$$f^a = F^{ab}j_b = -\frac{1}{\mu_0}F^{ab}F_b{}^d{}_{,d}. \tag{23.90}$$

From this we conclude that the required tensor must be quadratic in the tensor $F^{ab}$ and it can also contain the Lorentz metric $\eta_{ab}$. The most general²² form of a symmetric second rank tensor satisfying these properties is the following

$${}_{EM}T^{ab} = A F^{ac}F_c{}^b + B\eta^{ab}\left(F^{cd}F_{cd}\right) \tag{23.91}$$

22 Modulo a term with vanishing divergence.




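where the scalars $A$, $B$ must be determined from the requirement ${}_{EM}T^{ab}{}_{,b} = qF^{ab}u_b$. We compute the divergence ${}_{EM}T^{ab}{}_{,b}$:

$$\begin{aligned}
{}_{EM}T^{ab}{}_{,b} &= A F^{ac}{}_{,b}F_c{}^b + A F^{ac}F_c{}^b{}_{,b} + 2B\eta^{ab}F^{cd}F_{cd,b} \\
&= A F^{ac}F_c{}^b{}_{,b} + F^{cb}\left[A F^a{}_{c,b} + 2B\eta^{ad}F_{cb,d}\right] \\
&= -A\mu_0 f^a + F^{cb}\left[A F^a{}_{c,b} + 2B\eta^{ad}F_{cb,d}\right].
\end{aligned}$$

The second term is written as follows, using (23.87):

$$\begin{aligned}
F^{cb}\left[A F^a{}_{c,b} + 2B\eta^{ad}F_{cb,d}\right] &= F_{cb}\left[2B F^{cb,a} + A F^{ac,b}\right] \\
&= F_{cb}\left[2B\left(-F^{ba,c} - F^{ac,b}\right) + A F^{ac,b}\right] \\
&= F_{cb}\left[2B\left(-F^{ac,b} - F^{ac,b}\right) + A F^{ac,b}\right] \\
&= (-4B + A)\, F_{cb}F^{ac,b}.
\end{aligned}$$

Therefore finally

$${}_{EM}T^{ab}{}_{,b} = -A\mu_0 f^a + (-4B + A)\, F_{cb}F^{ac,b}. \tag{23.92}$$

The requirement ${}_{EM}T^{ab}{}_{,b} = f^a$ gives:

$$A = -\frac{1}{\mu_0}, \qquad B = -\frac{1}{4\mu_0}.$$

We conclude that the required energy momentum tensor of the electromagnetic fluid in empty space is:

$${}_{EM}T^{ab} = -\frac{1}{\mu_0}\left[F^{ac}F_c{}^b + \frac{1}{4}\eta^{ab}\left(F^{cd}F_{cd}\right)\right]. \tag{23.93}$$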


We compute next ${}_{EM}T^{ab}$ in terms of the fields $E^a$, $B^a$ in a frame $\Sigma$ using the standard expression (13.245) of the field tensor $F^{ab}$ in terms of the fields $E^a$, $B^a$ as well as the inverse relations (13.110). The term $F^{cd}F_{cd}$ is the invariant $-2X$ of the electromagnetic field (see (13.59)) therefore:

$$F^{cd}F_{cd} = -2X = -2\left(\frac{E^c E_c}{c^2} - B^c B_c\right). \tag{23.94}$$
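The tensor (23.93) and the invariant (23.94) can be evaluated numerically in the rest frame of the observers. The sketch below is an illustration under stated assumptions: it does not use the book's expression (13.245) but the common convention $F^{0i} = E_i/c$, $F^{ij} = \epsilon_{ijk}B_k$ for the metric diag(-1, 1, 1, 1), and the values of $c$, $\mu_0$, $\mathbf{E}$, $\mathbf{B}$ are hypothetical. With this convention the sketch reproduces the invariant of (23.94), the energy density $\frac{1}{2\mu_0}(E^2/c^2 + B^2)$ in the $(0,0)$ component, the Poynting flux divided by $c$ in the $(0,i)$ components, and a vanishing trace.

```python
c = 2.0      # assumption: arbitrary units
mu0 = 0.5    # assumption: arbitrary units
ETA = [[-1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]
R = range(4)

E = [1.0, 2.0, -1.0]   # hypothetical electric field in the rest frame
B = [0.5, -0.3, 0.2]   # hypothetical magnetic field in the rest frame

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) / 2

# Assumed convention: F^{0i} = E_i / c, F^{ij} = eps_ijk B_k
F_up = [[0.0] * 4 for _ in R]
for i in range(3):
    F_up[0][i + 1] = E[i] / c
    F_up[i + 1][0] = -E[i] / c
    for j in range(3):
        F_up[i + 1][j + 1] = sum(eps(i, j, k) * B[k] for k in range(3))

# Index lowering with a diagonal metric
F_low = [[ETA[a][a] * ETA[b][b] * F_up[a][b] for b in R] for a in R]
F_mix = [[ETA[a][a] * F_up[a][b] for b in R] for a in R]   # F_a^b

inv = sum(F_up[a][b] * F_low[a][b] for a in R for b in R)   # F^{cd} F_cd

# (23.93): T^{ab} = -(1/mu0) [F^{ac} F_c^b + (1/4) eta^{ab} F^{cd} F_cd]
T = [[-(sum(F_up[a][d] * F_mix[d][b] for d in R) + 0.25 * ETA[a][b] * inv) / mu0
      for b in R] for a in R]

E2 = sum(x * x for x in E)
B2 = sum(x * x for x in B)
S = [(E[1] * B[2] - E[2] * B[1]) / mu0,
     (E[2] * B[0] - E[0] * B[2]) / mu0,
     (E[0] * B[1] - E[1] * B[0]) / mu0]   # Poynting vector E x B / mu0
trace = sum(ETA[a][a] * T[a][a] for a in R)

print("F^cd F_cd:", inv, " expected:", -2.0 * (E2 / c**2 - B2))
print("T^00:", T[0][0], " energy density:", (E2 / c**2 + B2) / (2.0 * mu0))
print("T^0i:", T[0][1:], " S/c:", [s / c for s in S])
print("trace:", trace)
```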




Concerning the term F’, > F@ we have the following: 


‘ 1 f ; i eee 1 1 

Fe F. b_ 5 (E%u° E‘u") ni Brus| a (Eeu! Ebuc) = Bu | 
1 : ‘ 

Se (E%u‘ = E‘u") (Eeu! _ Ebuc) 
1 a,c ca b kim 1 b b acrs 

ay (Eu — E°u*) {mq BAU — = (Een -—E we) m ” Brus 


1 
socal Brust Bey 


1 . 
= = (E*E?(-c?) — E°E,utu? 
c 
1 ‘ ; 
+3 (Bou neh Bea - Eq? n®"* By) 
1 1 ers 
= aie = aE Eeutw 
1 b E‘ BE ma 1 a E¢ B* m_b 
~ 73 |..ckm Wes 73 T.ckm “wou 


lt cars 


k bd 
= ) U] Ncdkm B;usB un 


The last term gives 


ne" Nedkm Brits Beu™ nP4 = — (898;.68, — 828%58, + d087,55 — 52 5,55 + 57,5758 — 848%, 52) 


x By Us Bey" ht 


(575;5m — 8¢548m — 55,53) Brits Bau 


B, B'( c7)n? BBY c?) B, B’ uu? 
= —c? (BbB* — B,BT ht) 


The quantity Sy = eal en E° Bu is the Poynting vector. Therefore finally 


we have 


E%E? 1 
pep ee pe Ee BP Ue (Sl esa) 
Cc Cc 


(23.95) 




Then the energy momentum tensor of the electromagnetic field is: 


1 [ E*E? 1. ; 1 (E°E- 
euT@ . B? Be ES E,utu? _ B,B' ne as i B.BS ne? 
Lo c c 2 c 
1 
— (Stu? + SPy4 (23.96) 
Cc 
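The terms

$$\begin{aligned}
&-\frac{1}{c^4}E^c E_c\, u^a u^b - B^c B_c\, h^{ab} - \frac{1}{2}\left(\frac{E^c E_c}{c^2} - B^c B_c\right)\eta^{ab} \\
&= -\frac{1}{c^4}E^c E_c\, u^a u^b - B^c B_c\, h^{ab} - \frac{1}{2}\left(\frac{E^c E_c}{c^2} - B^c B_c\right)\left(h^{ab} - \frac{1}{c^2}u^a u^b\right) \\
&= -\frac{1}{c^4}E^c E_c\, u^a u^b + \frac{1}{2}\left(\frac{E^c E_c}{c^2} - B^c B_c\right)\frac{1}{c^2}u^a u^b - B^c B_c\, h^{ab} - \frac{1}{2}\left(\frac{E^c E_c}{c^2} - B^c B_c\right)h^{ab} \\
&= -\frac{1}{2}\left(\frac{E^c E_c}{c^2} + B^c B_c\right)\frac{1}{c^2}u^a u^b - \frac{1}{2}\left(\frac{E^c E_c}{c^2} + B^c B_c\right)h^{ab} \\
&= -\frac{1}{2}\left(\frac{E^c E_c}{c^2} + B^c B_c\right)\left(\frac{1}{c^2}u^a u^b + h^{ab}\right)
\end{aligned}$$

therefore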

1 [E%E? 1 (ECE 1 

euT®? =-—| - + Bat 5 ( r+ BB) (aut +0?) | 
Lo Cc 2 Cc c 
1 

= aa (Stu + Su") (23.97) 


which coincides with (13.257) we found before for matter provided we replace 
D* = €9E“ and Hg = ap Ba pe = = 54 and make use of é9i40 = +. 
The stress tensor is , 


1 [ECE 1 (ECE 
| 3 + BoB >( r+ B.B*) | 
Mol ¢ c 


which coincides with the traceless stress tensor (13.256) and the Maxwell stress 
tensor (13.253). 
The energy momentum tensor gy 7? can be written as follows: 


b b b b b b 
EMI“ = [hemU“u + Pemh* + Ge,u + u "Gen + en 




where: 


Ho \ c 
ae E‘Ec . p pe)! 
PEM = 610 7) ce = grEM 
dem = oS“ 
i fee 
nob = a art B? BA — Henk 


Equation (23.99) is the well known equation of state for radiation.




(23.98) 


(23.99) 


(23.100) 


(23.101) 


